ABSTRACT

This technical summary describes the changes in
science performance on exercises included in both the first and
second science assessments and on exercises included in both the
second and third science assessments conducted by the National
Assessment of Educational Progress (NAEP). Using the same exercises
for adjacent assessments, with some exercises common to all three,
National Assessment was able to measure improvements and declines in
achievement between 1969-70 and 1976-77. Each assessment utilized a
deeply stratified, multistage probability sample design and a
professional data collection staff. To the extent possible,
administration conditions were kept constant across assessments. The
document contains a brief introduction and four chapters: chapter 1
contains background information about the project, chapter 2 presents
national results by age levels (9 years, 13 years, 17 years), chapter
3 reports group results for 9-, 13- and 17-year-olds, and chapter 4
contains a discussion of the adult science assessments. Four
appendices are included: (1) A-Technical Procedures: Sampling and
Estimation of Standard Errors; (2) B-Estimated Population Proportions
of Reporting Groups Based on National Assessment Samples, 1969-70,
1972-73, and 1975-76; (3) C-Changes in Procedures Between
Assessments; and (4) D-Nonresponse in Assessment Samples. (PSB)
FOREWORD

When the U.S. Office of Education was chartered in 1867, one charge to
its commissioners was to determine the nation's progress in education. The
National Assessment of Educational Progress (NAEP) was initiated a century
later to address, in a systematic way, that charge.

Each year since 1969, National Assessment has gathered information about
levels of educational achievement across the country and reported its findings
to the nation. NAEP surveys the educational attainments of 9-year-olds,
13-year-olds, 17-year-olds and young adults, ages 26-35, in 10 learning areas:
art, career and occupational development, citizenship, literature, mathematics,
music, reading, science, social studies and writing. Different learning areas
are assessed every year, and all areas are periodically reassessed in order to
measure change in educational achievement. National Assessment has interviewed
and tested more than 720,000 young Americans since 1969.

Learning-area assessments evolve from a consensus process. Each assessment
is the product of several years of work by a great many educators, scholars
and lay persons from all over the nation. Initially, these people design
objectives for each subject area, proposing general goals they feel Americans
should be achieving in the course of their education. After careful reviews,
these objectives are given to exercise (item) writers, whose task it is to
create measurement tools appropriate to the objectives.

When the exercises have passed extensive reviews by subject-matter specialists,
measurement experts and lay persons, they are administered to probability
samples. The people in these samples are chosen in such a way that
they represent the national population. Therefore, on the basis of the
performance of about 2,500 9-year-olds on a given exercise, we can make
generalizations about the probable achievement of all 9-year-olds in the nation.
Performance is reported in terms of the percentages of young people correctly
answering a given exercise or set of exercises; changes in performance are the
differences between the percentages of young people correctly answering a
given exercise or set of exercises from one point in time to another.


After assessment data have been collected, scored and analyzed, National
Assessment publishes reports to disseminate the results as widely as possible.
Not all exercises are released for publication. Because NAEP will readminister
some of the same exercises in the future to determine whether the performance
level of Americans has increased, decreased or remained the same, it is essential
that they not be released in order to preserve the integrity of the study.

See the inside back cover of this report for a complete listing of additional
reports on science assessments.
INTRODUCTION



The National Assessment of Educational Progress has completed three assess- 
ments of science. They were conducted in 1969-70, 1972-73 and 1976-77.^ 

This report summarizes changes in science performance on exercises included 
in both the first and second science assessments and on exercises included in 
both the second and third science assessments. Using the same exercises for
adjacent assessments, with some exercises common to all three, National
Assessment was able to measure improvements and declines in achievement between
1969-70 and 1976-77.

Each assessment utilized a deeply stratified, multistage probability sam- 
ple design and a professional data collection staff. To the extent possible, 
administration conditions were kept constant across assessments. Appendix C 
documents the procedural changes that have occurred between the first and 
third assessments. 

Changes in science performance in this report have been summarized accord- 
ing to the 1972-73 science objectives^ and by type of science (content). The 
content clusters comprise biology, physical science and other, or unclassified. 
An additional summary has been included for exercises that were administered 
in all three assessments of 9-, 13- or 17-year-olds enrolled in school. 

National Assessment has published a number of reports related to science. 
A complete list is included on the inside back cover of this report. Reports 
most relevant to this technical summary include: 

• Report 1 - Science: National Results (July 1970). Contains released
exercises from the first science assessment and technical documentation
of methodology.



^The assessment schedule varied for each age level. The actual administration
dates were:

Age 9: January through February 1970, 1973 and 1977

Age 13: October through December 1969, 1972 and 1976

Age 17: March through May 1969, 1973 and 1977

Young Adults: October 1972 through May 1973 and May through July 1977

^Science Objectives for 1972-73 Assessment (Denver, Colo.: Education Commission
of the States, National Assessment of Educational Progress, 1972).
• Report 04-S-20 — Changes in Science Performance, 1969-73: Exercise
Volume (December 1975). Contains change exercises that were released
after the 1972-73 science assessment with national results for all
responses and correct response results for sex, race and region.

• Report 04-S-20 - Changes in Science Performance, 1969-73: Exercise
Volume, Appendix (two volumes, April 1977). Contains all exercises released
after the 1972-73 science assessment, with percentages and
standard errors as well as change statistics for region, sex, race,
parental education and size and type of community.

• Report 04-S-21 — Science Technical Report: Summary Volume (May 1977).
Contains detailed methodological documentation of the 1969-70 and 1972-73
science assessments as well as summary data for objectives and content
classifications.

• Report 03/04-GIY — General Information Yearbook (December 1974). Contains
a condensed description of National Assessment methodology with
emphasis on the 1971-72 and 1972-73 assessments.

• Report 08-S-00 — Three National Assessments of Science: Changes in
Achievement, 1969-77 (June 1978). Contains a capsule description of
changes in science achievement between 1969 and 1977 with interpretive
comments by a group of science educators.

• The Third Assessment of Science, 1976-77: Released Exercise Set (May
1978). Contains exercises released after the 1976-77 science assessment
[text illegible in source] to measure changes in achievement from
1969-70 and 1972-73.

• Technical Appendix to the Third Assessment of Science, 1976-77: Released
Exercise Set (December 1978). Contains 1976-77 percentages of correct
responses and standard errors for correct responses to all released
cognitive exercises. Variables include race, sex, region, community size
and grade.



Organization of the Report

The first chapter presents a history of the development of
the science objectives and exercises and describes procedures for sampling,
data collection, scoring and analysis.

The second chapter summarizes changes in mean percentages of acceptable
responses for each in-school age group. Summaries are presented for all
exercises, the 1972-73 science objectives, content categories and exercises
administered in all three science assessments.

The third chapter describes changes in performance for various school-age
subpopulations: geographic region, sex, race, level of parental education,
type of community, size of community and grade in school.

The fourth chapter describes changes in performance for young adults,
ages 26-35, between 1973 and 1977.
CHAPTER 1
BACKGROUND 



History of Objectives and Exercise Development 

The exercises used to report changes in science achievement measure broad
education objectives, which represent a consensus of educators, subject-matter
experts and interested lay persons about what young Americans should know and
be able to do. These objectives are not an attempt to mandate behavior and
value systems; rather, they represent goals that a diverse group of people
identified as desirable for young Americans to accomplish.

Objectives for the 1969-70 science assessment were developed by the Educa- 
tional Testing Service in 1965.^ During 1969 through 1971, the objectives were 
reorganized for the 1972-73 assessment.^ The major 1972-73 objectives were: 


1. Know the fundamental aspects of science.

2. Understand and apply the fundamental aspects of science in a wide
range of problem situations.

3. Appreciate the knowledge and processes of science, the consequences
and limitations of science, and the personal and social relevance of
science and technology in our society.

Subobjectives for each objective consisted of Fundamental Aspects of Science
and the Scientific Enterprise. Fundamental aspects were further subdivided
into: facts and simple concepts, laws (principles), conceptual schemes
and inquiry skills.


The number of exercises used to measure change between assessments by age 
group and objective is shown on the following page. 




^Science Objectives, 1969-70 Assessment (Ann Arbor, Mich.: Committee on
Assessing the Progress of Education, 1969), available through the National
Assessment offices.

^Science Objectives for 1972-73 Assessment (Denver, Colo.: Education Commission
of the States, National Assessment of Educational Progress, 1972).



Assessments           Age   Know   Understand/   Appreciate   Total Number
                                   Apply                      of Exercises

1969-70 to 1972-73     9     40        47             5            92
                      13     37        28             2            67
                      17     34        30             0            64

1972-73 to 1976-77     9     32        36             3            71
                      13     38        37             0            75
                      17     31        37             2            70



The process of developing objectives and exercises to assess performance 
in a subject area across time is a difficult task. There must be a sufficient 
number of identical items to measure change reliably; on the other hand, the
assessment must keep current with changing curriculum objectives. Therefore, 
after each assessment some items are released to the public and some are kept 
secure for the purpose of measuring change. Before the next assessment, the 
objectives are reviewed and revised, and new items are written to measure the 
revised objectives. 

For the 1976-77 assessment, a somewhat different approach to objectives
development was taken. Science consultants and National Assessment staff
agreed that the 1972-73 objectives represented an excellent statement of the
purposes and goals of science education but were not specific enough to provide
a clear guide for writing assessment exercises. For assessment purposes,
a two-dimensional grid was defined.^ The first dimension is similar to the
1972-73 objectives, with four levels: knowledge; comprehension; application;
and analysis, synthesis and evaluation. The second dimension divides the domain
of science into three major areas: content, the body of science knowledge;
the process by which the body of knowledge comes about; and science and
society, the implications of that body of knowledge for mankind. Each of these
is further subdivided into specific components. Within each cell of the grid,
specific objectives were developed to guide item development.

While the 1976-77 objectives have not been used as a basis with which to
summarize the changes in achievement from preceding assessments, they have been
used to summarize cognitive achievement in the 1976-77 assessment^ and will be
used to summarize change measures from 1976-77 to the next assessment of
science.



^Science Objectives for the 1976-77 Assessment (Denver, Colo.: Education Commission
of the States, National Assessment of Educational Progress, forthcoming).

^Science Achievement in the Schools, Report 08-S-01, 1976-77 Assessment (Denver,
Colo.: Education Commission of the States, National Assessment of Educational
Progress, 1978).



Many people from across the country have been involved in the development
of objectives and items for these assessments. Subject-matter specialists,
measurement experts and lay persons not only helped develop the objectives,
they also participated in reviewing and revising exercises. All newly developed
items were field-tested with students representative of high- and low-performing
groups. Before and after each "tryout" assessment, the exercises were discussed
by panels of reviewers, many of whom represented minority groups, to guard
against the possibility of racial, ethnic or sexual bias.



Sampling and Data Collection

Each year National Assessment selects respondents at ages 9, 13 and 17
using a deeply stratified, multistage probability sample design.^ This sample
design guarantees that each respondent is selected with a known probability;
hence, each respondent represents a known fraction of the entire population at
that age level. By weighting each respondent's performance inversely to his or
her probability of selection, National Assessment can make appropriate
generalizations about the entire population of 9-year-olds, 13-year-olds and
17-year-olds enrolled in school.
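
The weighting idea in the preceding paragraph can be illustrated with a minimal Python sketch. It assumes only that each respondent record carries a known selection probability; the function name `represented_population` and the probabilities shown are hypothetical and are not taken from National Assessment files.

```python
def represented_population(selection_probabilities):
    """Return (weights, estimated population size).

    Each respondent's weight is the inverse of his or her selection
    probability, i.e. the number of population members that respondent
    stands for. The sum of the weights estimates the population size.
    """
    weights = []
    for p in selection_probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError("selection probability must be in (0, 1]")
        weights.append(1.0 / p)
    return weights, sum(weights)


# Hypothetical probabilities, chosen only to keep the arithmetic easy to follow.
weights, total = represented_population([0.25, 0.5, 0.5])
print(weights, total)
```

A respondent selected with probability 0.25 stands for four members of the population, so that respondent's answers count twice as heavily as those of a respondent selected with probability 0.5.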

National Assessment does not follow up specific individuals from one
assessment to the next. In other words, the students who participated in the
1969-70 or the 1972-73 assessments are not the same ones who participated in
1976-77. However, in each assessment year, participants are carefully selected
to represent each age level. For example, National Assessment assessed one
probability sample of 9-year-olds to ascertain science achievement in 1970
and totally different probability samples of 9-year-olds in 1973 and 1977.
Each was a sample of the population of students who were 9 years old during
that assessment year. Thus, when we say that 9-year-olds' achievement declined
between 1970 and 1973, we mean that students who were 9 years old in 1970
correctly answered the same questions more often than those who were 9 years old
in 1973.

The three school-age populations selected for each of the science assessments
were defined as follows:



^See Appendix A for technical details about National Assessment sampling
procedures.



Age     1969-70              1972-73              1976-77
        Assessment           Assessment           Assessment

 9      Born in 1960         Born in 1963         Born in 1967

13      Born in 1956         Born in 1959         Born in 1963

17      Born between         Born between         Born between
        October 1951 and     October 1955 and     October 1959 and
        September 1952       September 1956       September 1960



The populations were further restricted to students enrolled in public or private
schools, who were neither in institutions nor too functionally handicapped
to respond to assessment exercises.^

Once the exercises were selected for the assessments, they were assembled
into booklets that were administered to probability samples of each appropriate
age group. Not all students responded to all exercises. Each booklet or
group of exercises was administered to a representative sample of about 2,500
9-, 13- or 17-year-olds. The approximate numbers of respondents who participated
in the science assessments are shown in Table 1-1.



TABLE 1-1. Numbers of Respondents for the
Science Assessments, Ages 9, 13 and 17

Age     1969-70     1972-73     1976-77

 9      19,468      20,852      17,345
13      21,696      23,507      25,653
17      22,913      25,865      29,140



In order for an assessment to measure changes in performance reliably, it
must replicate testing conditions as nearly as possible. Thus, items used to
measure change are as nearly identical in wording and format in each assessment
as is possible. National Assessment further attempts to keep administration
procedures constant by tape-recording instructions and items and by using
trained administrators, rather than classroom personnel, to conduct assessments.
A discussion of changes that have taken place over the course of the three
assessments can be found in Appendixes A and C.



^The 1969 and 1973 assessments of 17-year-olds included samples of dropouts
and early graduates. Funding limitations precluded a similar sample in 1977.
Thus, results in this report are limited to 17-year-olds enrolled in school.



Scoring



It is also essential that identical scoring procedures be used in each
assessment if data are to be used to measure change. Both multiple-choice and
open-ended exercises were included in the science assessments. Not more than
six open-ended exercises per age were included in change summaries for 1972-73
to 1976-77. One open-ended exercise for 17-year-olds was included in 1969-70
to 1972-73 summaries. Individually administered experiments were included in
both the 1969-70 and 1972-73 assessments. Because of technical difficulties
with apparatus and scoring protocols, none of the experiments were included in
change summaries for the first two assessments. Funding limitations precluded
the use of individually administered experiments in the 1976-77 assessment.

Responses to multiple-choice items were marked directly in the assessment
booklets. The booklets were optically scanned and edited by both computer and
scoring staffs to ensure reliable scoring.

Only about three to five open-ended exercises per age group were available
for measuring change between the 1969-70 and 1972-73 assessments. One exercise
was rescored for age 17 and included in change summaries. The remainder were
omitted from summaries because of the questionable comparability of scoring
procedures.

Scoring comparability for open-ended items was achieved between 1972-73
and 1976-77 by rescoring the 1972-73 responses simultaneously with the scoring
of 1976-77 responses. Four highly trained scorers with previous assessment
scoring experience coded the responses for each age group as assessment booklets
were received from the data collection staff.

Scoring for each age group took 8 to [figure illegible in source] weeks. At the
beginning of scoring for each age group, the scorers were trained by the Measurement
Research Center scoring director, a science consultant and the National Assessment
science analyst. The scoring guide for each exercise was presented and discussed.
Sample responses from both the 1972-73 and 1976-77 assessments were independently
coded by both scorers and trainers, and scores were compared for consistency.
Scoring guides were clarified and revised, if necessary, and more sample responses
were scored until near-perfect consistency was achieved.


To help maintain quality control and identify problems, 10% of each scorer's
work was independently scored by another, usually within one or two weeks
of each other. Agreement between scorers, on about 250 to 260 responses per
exercise, ranged from 96 to 100% on the open-ended exercises included in change
summaries, as shown below.



Age     Number of      Range of Percent of
        Exercises      Agreement on 10% Subsample

 9          5          96 to 100%
13          6          96 to 98%
17          6          97 to 100%



These figures indicate how consistently a small group of highly trained scorers 
can score the same set of papers. 
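
One simple way to compute a percent-of-agreement figure like those above is sketched here in Python. This is an illustration under stated assumptions, not the procedure documented by National Assessment: it assumes agreement means two scorers assigned the identical code to the same response, and the function name and score codes are hypothetical.

```python
def percent_agreement(first_scorer, second_scorer):
    """Percentage of responses given the same code by both scorers."""
    if len(first_scorer) != len(second_scorer) or not first_scorer:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(first_scorer, second_scorer))
    return 100.0 * matches / len(first_scorer)


# Hypothetical rescoring subsample: 250 responses, 10 coded differently.
original = [1] * 250
rescored = [1] * 240 + [0] * 10
print(percent_agreement(original, rescored))
```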



Measures of Achievement 



The basic measure of achievement reported by National Assessment is the
percentage responding acceptably to a given item. This percentage is an estimate
of the percentage of 9-, 13- or 17-year-olds who would respond acceptably
to a given item if every 9-, 13- or 17-year-old in the country were assessed.

Percentages of correct responses are used because each item is designed
as a separate measure of some aspect of an objective or subobjective. The purpose
of National Assessment is to discover if more or fewer people are able to
answer these items correctly - and thus meet the objectives - over the years.

Procedures for estimating percentages of acceptable responses to exercises
[remainder of sentence illegible in source]. Each response by an individual is weighted
and multiplied by an adjustment factor for nonresponse.^ An estimate of the
percentage of a particular age group that would have responded to an exercise
acceptably if the entire age group were assessed is defined as the weighted
number of acceptable responses divided by the weighted number of all responses.
A similar ratio of weights is used to estimate percentages of acceptable responses
for reporting groups or subpopulations of interest.^

The difference between the percentage of acceptable responses for a reporting
group and that of the entire age group on an exercise describes the performance
of any reporting group relative to the entire age group. This difference
is a positive number if the group achieves a higher percentage than the
entire age group and is a negative number if the group achieves a lower percentage.
For example, a group performance of +1.8 indicates that the percentage
of acceptable responses for the group is 1.8 percentage points higher than
the national percentage of acceptable responses for a particular age level.
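
The ratio estimate and the group difference described in the two preceding paragraphs can be sketched in Python as follows. This is a minimal illustration, assuming each response record holds a sampling weight, a nonresponse adjustment factor, a reporting-group label and an acceptable/unacceptable flag; the field names and the data are hypothetical.

```python
def pct_acceptable(responses, group=None):
    """Weighted acceptable responses divided by weighted total responses (x100).

    If `group` is given, only responses from that reporting group are used.
    """
    acceptable = 0.0
    total = 0.0
    for r in responses:
        if group is not None and r["group"] != group:
            continue
        w = r["weight"] * r["nonresponse_adj"]  # adjusted weight
        total += w
        if r["acceptable"]:
            acceptable += w
    if total == 0.0:
        raise ValueError("no responses for the requested group")
    return 100.0 * acceptable / total


def group_difference(responses, group):
    """Group percentage minus the percentage for the entire age group."""
    return pct_acceptable(responses, group) - pct_acceptable(responses)


# Hypothetical records, not National Assessment data.
responses = [
    {"group": "A", "weight": 100.0, "nonresponse_adj": 1.0, "acceptable": True},
    {"group": "A", "weight": 100.0, "nonresponse_adj": 1.0, "acceptable": False},
    {"group": "B", "weight": 50.0, "nonresponse_adj": 2.0, "acceptable": True},
    {"group": "B", "weight": 100.0, "nonresponse_adj": 1.0, "acceptable": True},
]
print(pct_acceptable(responses), group_difference(responses, "A"))
```

A negative result from `group_difference` corresponds to a group performing below the national percentage, and a positive result to a group performing above it.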

Increases or decreases in the percentage of acceptable responses between
two assessments are estimated by finding the difference between percentages
obtained from each assessment. A positive difference indicates an increase



^Appendix D discusses nonresponse in assessment samples.

^Following the 1976-77 assessment, a weighting-class adjustment procedure was
used to dampen fluctuations in estimated population proportions across the
eight assessments conducted between 1969-70 and 1976-77. Documentation of
this procedure and estimated population proportions are included in Appendix
B. Consequently, the estimated percentages of correct responses in this
report and Three National Assessments of Science: Changes in Achievement,
1969-77, Report 08-S-00, may deviate slightly from the figures in earlier
science change reports.



and a negative difference indicates a decrease in the percentage of students
who responded acceptably from one assessment to the next. These differences,
or change measures, are used to indicate trends in achievement, or performance,
for an age level or subpopulation of interest. Changes in group differences
from the national performance between two assessments are used to indicate the
relative trend of a group compared to the national trend of the age group.
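
The two kinds of change measure just described reduce to simple subtraction, as the following Python sketch shows. The function names and the percentages in the example are hypothetical and are used only to make the sign conventions concrete.

```python
def change(earlier_pct, later_pct):
    """Change measure: positive for an increase, negative for a decrease."""
    return later_pct - earlier_pct


def relative_trend(group_earlier, nation_earlier, group_later, nation_later):
    """Change in a group's difference from the national percentage.

    Positive values mean the group gained ground relative to the nation
    between the two assessments; negative values mean it lost ground.
    """
    return (group_later - nation_later) - (group_earlier - nation_earlier)


# Hypothetical percentages for one exercise in two assessments.
print(change(62.0, 58.0))
print(relative_trend(60.0, 62.0, 59.0, 58.0))
```

In this made-up case the nation declined by 4 points while the group declined by only 1, so the group's standing relative to the nation improved by 3 points even though its own percentage fell.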

To present a general picture of changes in achievement, National Assessment
summarizes the gains or losses on each exercise (either for the entire learning
area or for some integral set of exercises) by using the mean, or arithmetic
average, of the changes in percentages of acceptable responses to the exercises.
During the first years of the assessment, the median was used as the
principal summary measure. However, the mean was chosen as the principal summary
measure of change after extensive investigation showed empirically that
it was more suitable for National Assessment change data than alternative
measures.^ In addition, the mean is an easily understood and fairly well-known
statistic and has simple arithmetic properties useful for the analysis of
differences or change measures — in particular, the difference between means is
the same as the mean difference. This property allows us to describe accurately
the mean change as the difference between mean percentages of acceptable
responses from one assessment to the next. Mean percentages for the science
assessments are used throughout this report to simplify descriptions of change.
Its use does not signify that the mean is the best summary statistic to use in
each assessment separately, nor do we intend that the mean percentage should
be construed as an average test score.
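
The arithmetic property cited above (the difference between means equals the mean difference when the same exercises are used in both assessments) can be checked with a short Python sketch. The percentages are hypothetical values for three exercises.

```python
from statistics import mean


def mean_change(earlier_pcts, later_pcts):
    """Mean of the exercise-by-exercise changes in percentage acceptable."""
    if len(earlier_pcts) != len(later_pcts) or not earlier_pcts:
        raise ValueError("need the same non-empty set of exercises in both years")
    return mean(later - earlier for earlier, later in zip(earlier_pcts, later_pcts))


# Hypothetical percentages of acceptable responses on the same three exercises.
earlier = [50.0, 62.0, 71.0]
later = [48.0, 63.0, 66.0]

via_differences = mean_change(earlier, later)
via_means = mean(later) - mean(earlier)
print(via_differences, via_means)
```

Both routes give the same value because averaging is linear; the median does not share this property in general, which is one reason the identity is worth stating.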

Unless the items summarized in the mean percentages of acceptable responses
are identical, the means of one age group should not be compared to the
means of another, since their values reflect the choice of exercises in addition
to the performance of the students. When only a few exercises are summarized
by a mean, we should be especially cautious in interpreting results,
since a small set of exercises might not adequately cover the wide range of
potential behaviors included under a given objective or subobjective. The
mean should be interpreted literally as the arithmetic average of the percentage
of acceptable responses obtained from National Assessment samples on a
specific set of exercises.



^Twenty-two empirical distributions of change measures from the 1969-70 and
1972-73 science assessments were used to generate Monte Carlo simulations of
sampling distributions for several measures of central location. In addition
to the mean and median, other measures of central location that were considered
in the simulation studies included the average of the extremes, two forms of
biweighted estimates and three forms of weight-matching estimators described
by John W. Tukey in the research report, "Some Considerations on Locators
Apt for Some Squeezed-Tail (and Stretched-Tail) Parents" (paper prepared in
connection with research at Princeton University supported by the Army Research
Office, summer 1975). In almost every case, the sampling stability of the
mean change was as good as or better than that of the other measures studied.
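
The kind of comparison described in this note can be imitated with a small resampling simulation in Python. The sketch below is only an assumption about how such a study might be set up: it redraws samples with replacement from one empirical set of change measures and compares the spread of the resulting means and medians. It does not reproduce the Tukey estimators or the original simulation design, and the change values are hypothetical.

```python
import random
import statistics


def sampling_sd(changes, estimator, n_reps=2000, seed=0):
    """Standard deviation of an estimator over simulated resamples.

    Each replication draws len(changes) values with replacement from the
    empirical distribution and applies the estimator to the draw.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n = len(changes)
    estimates = [estimator(rng.choices(changes, k=n)) for _ in range(n_reps)]
    return statistics.stdev(estimates)


# Hypothetical exercise-level changes in percentage acceptable.
hypothetical_changes = [-4.1, -2.3, -1.8, -0.9, -0.2, 0.4, 1.1, 2.7, -3.0, -1.2]

for name, estimator in [("mean", statistics.mean), ("median", statistics.median)]:
    print(name, round(sampling_sd(hypothetical_changes, estimator), 2))
```

A smaller standard deviation indicates a more stable summary measure for that particular empirical distribution.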



In the analysis of National Assessment's achievement measures, notice that
the differences in performance among assessment years, among groups and among
ages are most useful. By maintaining the same item or set of items in making
these comparisons, we have a reasonable indicator of whether more or fewer
people know or can do something judged important.

Estimating Variability in Achievement Measures

National Assessment uses a national probability sample at each age level
to estimate the proportion of people who would successfully complete an exercise.
The particular sample selected is one of a large number of all possible
samples of the same size that could have been selected with the same sample
design. Since an achievement measure computed from each of the possible samples
would differ from one sample to another, the standard error of this statistic
is used as a measure of the sampling variability among achievement measures
from all possible samples. A standard error, based on one particular sample,
serves to estimate that sampling variability.

In the interest of sampling and cost efficiencies, National Assessment
uses a complex, stratified, multistage probability sample design. Typically,
complex designs do not provide for unbiased or simple computation of sampling
errors. A reasonably good approximation of standard-error estimates of acceptable
response percentages is obtained by applying the jackknife procedure^
to first-stage sampling units within strata. Standard errors for achievement
measures such as group differences, mean percentages or mean group differences
for a particular assessment year are estimated directly, taking advantage of
features of the jackknife procedure that are generic to all of these statistics.
Since samples for different assessments are independent, the standard
errors of the differences in achievement measures between assessments can be
estimated simply by the square root of the sum of squared standard errors from
each of the assessments.
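
The two computations named in this paragraph can be sketched in Python. The jackknife shown is a generic delete-one-unit stratified jackknife for a weighted percentage, offered as an assumption about the general technique; the exact variant used by National Assessment is documented in Appendix A and may differ. The function names and the first-stage unit totals are hypothetical.

```python
import math


def jackknife_se(strata):
    """Weighted percentage and its delete-one jackknife standard error.

    `strata` is a list of strata; each stratum is a list of first-stage
    units given as (weighted acceptable responses, weighted responses).
    Each stratum needs at least two units.
    """
    full_num = sum(a for s in strata for a, _ in s)
    full_den = sum(t for s in strata for _, t in s)
    theta = 100.0 * full_num / full_den
    variance = 0.0
    for s in strata:
        n = len(s)
        if n < 2:
            raise ValueError("each stratum needs at least two first-stage units")
        s_num = sum(a for a, _ in s)
        s_den = sum(t for _, t in s)
        scale = n / (n - 1)  # reweight the units that remain in the stratum
        for a, t in s:
            num = full_num - s_num + scale * (s_num - a)
            den = full_den - s_den + scale * (s_den - t)
            replicate = 100.0 * num / den
            variance += (n - 1) / n * (replicate - theta) ** 2
    return theta, math.sqrt(variance)


def se_of_change(se_first, se_second):
    """Standard error of a difference between two independent assessments."""
    return math.sqrt(se_first ** 2 + se_second ** 2)


# Hypothetical first-stage unit totals for two strata.
pct, se = jackknife_se([[(60.0, 100.0), (40.0, 100.0)], [(55.0, 100.0), (45.0, 100.0)]])
print(pct, se, se_of_change(se, se))
```

Because the samples for different assessment years are drawn independently, no covariance term appears in `se_of_change`.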

The standard error provides an estimate of sampling reliability for the
achievement measures used by National Assessment. It is composed of sampling
error and other random error associated with the assessment of a specific item
or set of items. Random error includes all possible nonsystematic error
associated with administering specific exercises to specific students in specific
situations. Random differences among scorers for open-ended items are
also included in the standard errors.

^R.G. Miller Jr., "A Trustworthy Jackknife," Annals of Mathematical Statistics,
No. 35 (1964), pp. 1594-1605; R.G. Miller Jr., "Jackknifing Variances," Annals
of Mathematical Statistics, No. 39 (1968), pp. 567-82; F. Mosteller and
J.W. Tukey, "Data Analysis, Including Statistics," in Handbook of Social Psychology,
2nd ed., ed. G. Lindzey and E. Aronson (Reading, Mass.: Addison-Wesley,
1968).

^See Appendix A for a more detailed description of National Assessment's
computation of standard errors.

In this report, we designate with an asterisk item differences or mean
differences that are at least twice as large as their standard errors. By so
designating these differences, we are adopting the usual convention that
differences this large would occur by chance in fewer than 5% of all possible
replications of our sampling and data collection procedures.
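
The asterisk convention can be expressed as a short Python sketch. The function names and the example values are hypothetical; only the rule itself (a difference at least twice its standard error) comes from the text.

```python
def is_flagged(difference, standard_error):
    """True when the difference is at least twice its standard error."""
    return abs(difference) >= 2.0 * standard_error


def format_change(difference, standard_error):
    """Signed change to one decimal place, with an asterisk when flagged."""
    mark = "*" if is_flagged(difference, standard_error) else ""
    return f"{difference:+.1f}{mark}"


# Hypothetical changes, each with a standard error of 1.0.
print(format_change(-2.1, 1.0))
print(format_change(1.5, 1.0))
```

The rule is applied to the absolute value of the difference, so declines and gains are flagged on the same basis.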



Controlling Nonrandom Errors

Systematic errors can be introduced at any stage of an assessment — exercise
development, preparation of exercise booklets, design of administration
procedures, field administration, scoring or analysis. These nonsampling, 
nonrandom errors rarely can be quantified, nor can the magnitude of the bias 
they introduce into our estimates be evaluated directly. 

Systematic errors can be controlled in large part by employing uniform
administration and scoring procedures and by requiring rigorous quality control
in all phases of an assessment. If the systematic errors are the same
from age to age or group to group, then the differences in percentages or mean
percentages are measured with reduced bias because subtraction tends to cancel
the effect of the systematic errors.

Similarly, the effect of systematic errors in different assessment years
can be controlled by carefully replicating in the second assessment the
procedures carried out in the first. Differences in achievement across assessment
years will also be measured with reduced bias since subtraction will again tend
to cancel systematic errors.
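
The cancellation argument can be illustrated with a small Python sketch. It assumes a simple additive model (observed percentage = true percentage + systematic bias), which is a simplification introduced here; all numbers and the function name `observed_change` are hypothetical.

```python
def observed_change(true_earlier, true_later, bias_earlier, bias_later):
    """Change between two observed percentages, each modeled as the
    true percentage plus an additive systematic bias (an assumption)."""
    observed_earlier = true_earlier + bias_earlier
    observed_later = true_later + bias_later
    return observed_later - observed_earlier


# Hypothetical values, chosen only for illustration.
same_bias = observed_change(61.0, 59.8, bias_earlier=2.0, bias_later=2.0)
shifted_bias = observed_change(61.0, 59.8, bias_earlier=2.0, bias_later=2.5)

print(round(same_bias, 1))     # the true change is recovered when the bias is constant
print(round(shifted_bias, 1))  # a shift in bias between years leaks into the estimate
```

When the bias is identical in both years it drops out of the subtraction; when procedures change between assessments, only the change in bias remains in the estimated difference.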

However, it is not possible for every condition or procedure to remain the
same between assessments conducted several years apart. Improvements in field
procedures and sample design have been made, school cooperation rates have
improved slightly since the early assessments, packaging of exercises was not
identical in each assessment, and shifts in the composition of categories of
respondents have occurred over the years.^^



^^Appendix C examines some of these changes and discusses the possible effects
of these systematic errors on the results in this report.



CHAPTER 2

NATIONAL RESULTS

This chapter presents national data on changes in science performance for
9-, 13- and 17-year-olds. Results are summarized for all exercises used to
measure change from 1969-70 to 1972-73 and from 1972-73 to 1976-77. They are
also summarized by the 1972-73 objectives and type of science, as well as the
sets of exercises common to all three assessments.

Discussion of results is minimized since National Assessment has published
a major descriptive report based on these data.^ This chapter contains the
results presented in that report, plus supplementary information.

Results for 9-Year-Olds

Table 2-1 contains the number of exercises, means and standard errors for
each set of change exercises. Between 1970 and 1973, science achievement of
9-year-olds declined on most of the summary measures. The decline was not
significant for biology or unclassified exercises, and the five exercises
dealing with the objective of appreciation showed a significant increase.
There was no overall change between 1973 and 1977. Achievement on physical
science exercises declined significantly, but it increased significantly on
unclassified exercises, while achievement on biology exercises increased by
almost two standard errors. The three appreciation exercises showed a
significant increase in the percentage of correct responses.

Results for 13-Year-Olds

Table 2-2 contains the number of exercises and summary results for each
set of change exercises. Results for 13-year-olds from 1969-72 were similar
to those for 9-year-olds during the same time period. There was a significant
overall decline in achievement. The decline on biology exercises was not
significant, and performance on the unclassified science exercises showed no
change, while there was a significant increase on the two exercises dealing
with the appreciation objective. Between 1972 and 1976 there was no overall



^ Three National Assessments of Science: Changes in Achievement, 1969-77,
Report 08-S-00 (Denver, Colo.: Education Commission of the States, National
Assessment of Educational Progress, 1978).
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TABLE 2-1. Mean Percentages of Correct Responses in Three Assessments and Changes in
Percentages for All Exercises and Selected Exercise Classifications, Age 9

                                       Mean % Correct                                      Mean % Correct
                          Number of                                           Number of
                          Exercises    1970         1973         Change       Exercises    1973         1977         Change

All exercises             92           61.0         59.8         -1.2*        71           52.3         52.2         -.1
  Standard error                       [illegible]  [illegible]  [illegible]               (.4)         (.4)         (.6)

Type of science
  Biology                 27           70.4         [illegible]  -1.0         24           57.8         59.2         1.4
    Standard error                     (.4)         (.4)         (.6)                      (.4)         (.6)         (.7)
  Physical science        50           56.7         55.2         -1.5*        42           47.5         46.2         -1.3*
    Standard error                     [illegible]  (.5)         (.6)                      (.4)         (.4)         (.6)
  Unclassified            15           58.8         58.6         -.2          5            66.3         69.1         2.8*
    Standard error                     [illegible]  [illegible]  [illegible]               [illegible]  (.8)         (1.1)

1972-73 objective
  Know                    [illegible]  [illegible]  [illegible]  [illegible]  32           54.8         54.2         [illegible]
    Standard error                     (.3)         (.6)         (.6)                      (.5)         (.6)         (.7)
  Understand and apply    47           55.0         54.6         [illegible]  36           47.7         47.8         -.2
    Standard error                     (.4)         (.5)         (.6)                      (.4)         (.4)         (.6)
  Appreciate              5            74.7         77.2         2.5*         3            82.0         84.9         3.0*
    Standard error                     (.6)         (.7)         (.9)                      (.7)         (.9)         (1.1)

Exercises used in all
  three assessments       30           64.8         63.7         -1.2*        30           63.7         62.9         -.8
    Standard error                     (.4)         (.4)         (.6)                      (.4)         (.5)         [illegible]

*Denotes differences greater than or equal to two standard errors.
Note: Computations were performed prior to rounding to one decimal place.



TABLE 2-2. Mean Percentages of Correct Responses in Three Assessments and Changes in
Percentages for All Exercises and Selected Exercise Classifications, Age 13

                                       Mean % Correct                                      Mean % Correct
                          Number of                                           Number of
                          Exercises    1969         1972         Change       Exercises    1972         1976         Change

All exercises             67           60.2         58.5         -1.7*        75           [illegible]  [illegible]  [illegible]
  Standard error                       (.4)         (.5)         (.6)                      (.4)         (.4)         (.6)

Type of science
  Biology                 23           60.9         59.6         -1.3         23           61.1         62.0         .9
    Standard error                     (.8)         (.5)         (.7)                      (.4)         (.5)         (.7)
  Physical science        36           59.7         57.1         -2.6*        [illegible]  [illegible]  49.6         -.8
    Standard error                     (.4)         (.5)         (.7)                      (.4)         (.4)         (.6)
  Unclassified            8            64.7         65.4         .7           5            [illegible]  55.8         -6.3*
    Standard error                     (.6)         (.6)         [illegible]               (.9)         (1.0)        (1.3)

1972-73 objective
  Know                    37           60.0         58.5         -1.6*        38           56.5         56.4         -.1
    Standard error                     (.3)         (.5)         (.5)                      (.4)         (.5)         (.7)
  Understand and apply    28           59.9         57.6         -2.2*        37           52.4         52.2         -.2
    Standard error                     (.4)         (.5)         [illegible]               (.4)         (.4)         (.6)
  Appreciate              2            66.0         69.8         3.9*         0            --           --           --
    Standard error                     (.7)         (.9)         [illegible]               --           --           --

Exercises used in all
  three assessments       13           63.3         61.4         -1.9*        23           61.4         59.7         -1.7*
    Standard error                     (.4)         (.5)         (.5)                      (.5)         (.5)         (.7)

*Denotes differences greater than or equal to two standard errors.
Note: Computations were performed prior to rounding to one decimal place.



change, although performance on five unclassified science exercises declined,
as it did on the exercises carried over from the 1969 assessment.



Results for 17-Year-Olds 

Table 2-3 contains summary results for 17-year-olds on each set of change
exercises. Average achievement on all exercises declined between 1969 and
1973. These results are reflected in the other summaries for that time period:
all three types of science, the two objectives for which exercises were
available and the exercises common to all three assessments. Results between 1973
and 1977 were similar to those of the earlier time period. Achievement
declined on all classifications of exercises, but declines were not significant
for biology, the six unclassified science exercises or the two exercises
measuring the appreciation objective.

Summary

Science achievement of 9-, 13- and 17-year-olds declined between
1969-70 and 1972-73 on the overall summaries and most subclassifications.
That trend continued for 17-year-olds but not for 9- and 13-year-olds between
1972-73 and 1976-77. Achievement on biology exercises appears to have
stabilized, while the decline in achievement on physical science exercises might be
slowing for 9- and 13-year-olds. A larger, more comprehensive set of science
exercises will be available for measuring changes in performance between
1976-77 and the next assessment. Thus, data from the larger, fourth assessment,
combined with results from the first three, might clarify trends.
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TABLE 2-3. Mean Percentages of Correct Responses in Three Assessments and Changes in
Percentages for All Exercises and Selected Exercise Classifications, Age 17

                                       Mean % Correct                                      Mean % Correct
                          Number of                                           Number of
                          Exercises    1969         1973         Change       Exercises    1973         1977         Change

All exercises             64           45.2         42.5         -2.8*        70           [illegible]  [illegible]  [illegible]
  Standard error                       (.3)         (.3)         (.5)                      (.4)         (.4)         [illegible]

Type of science
  Biology                 20           52.3         51.1         -1.2*        19           53.3         52.2         [illegible]
    Standard error                     (.4)         (.4)         (.6)                      (.5)         (.5)         [illegible]
  Physical science        39           42.9         39.3         -3.5*        45           [illegible]  [illegible]  [illegible]
    Standard error                     (.4)         (.4)         (.5)                      (.4)         [illegible]  [illegible]
  Unclassified            5            35.6         32.1         -3.5*        6            44.8         [illegible]  [illegible]
    Standard error                     (.6)         (.6)         (.8)                      (.6)         (.7)         [illegible]

1972-73 objectives
  Know                    [illegible]  49.9         47.0         -2.9*        31           50.5         49.3         [illegible]
    Standard error                     (.4)         (.4)         (.5)                      (.4)         (.4)         [illegible]
  Understand and apply    30           40.0         37.3         -2.7*        37           45.7         43.3         [illegible]
    Standard error                     (.4)         (.4)         (.6)                      (.4)         (.5)         [illegible]
  Appreciate              0            --           --           --           2            65.4         63.2         [illegible]
    Standard error                     --           --           --                        (1.1)        (1.3)        [illegible]

Exercises used in all
  three assessments       23           44.6         42.3         -2.3*        23           42.3         39.9         [illegible]
    Standard error                     (.4)         (.4)         (.6)                      (.4)         (.4)         [illegible]

*Denotes differences greater than or equal to two standard errors.
Note: Computations were performed prior to rounding to one decimal place.



CHAPTER 3 



GROUP RESULTS FOR 9-, 13- AND 17-YEAR-OLDS



This chapter contains definitions of National Assessment reporting groups
and summary results for the full sets of exercises used to measure change from
1969-70 to 1972-73 and from 1972-73 to 1976-77. Respondents were classified
by their sex, race, region, highest level of parental education, type of
community, size of community and grade level. Estimated proportions for each
subpopulation are listed in Appendix B.

Definitions of Reporting Groups

The definitions of the categories used in this report for 9-, 13- and
17-year-olds are given below.



Sex 

Results are presented for males and females. 



Race 

Results are presented for blacks and whites.



Region

The country has been divided into four regions: Northeast, Southeast,
Central and West. States included in each region are shown on the following
page (see map).



Level of Parental Education 

Three categories of parental-education levels are defined by National
Assessment, based on students' reports about them. These categories are:
those whose parents did not graduate from high school; those who have at least
one parent who graduated from high school; and those who have at least one
parent who has had some post high school education.^



^ The form of the parental-education question was changed slightly after the
1969-70 assessment. Details are given in Appendix C.
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Grade Level

Results are categorized for 9-year-olds in the 3rd or 4th grade, 13-year-olds
in the 7th or 8th grade and 17-year-olds in the 10th, 11th or 12th grade.



Size of Community 

Big city. Students in this group attend schools within the city limits
of cities having a 1970 census population over 200,000.

Fringes around big cities. Students in this group attend schools within
metropolitan areas (1970 U.S. Bureau of the Census urbanized areas) served by
cities having a population greater than 200,000 but outside the city limits.

Medium city. Students in this group attend schools in cities having a
population between 25,000 and 200,000, not classified in the
fringes-around-big-cities category.

Smaller places. Students in this group attend schools in communities
having a population less than 25,000, not classified in the
fringes-around-big-cities category.



Type of Community 

Communities in this category are defined by an occupational profile of the
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area served by a school as well as by the size of the community in which the
school is located.

Advantaged-urban (high-metro) communities. Students in this group
attend schools in or around cities with a population greater than 200,000 where
a high proportion of the residents are in professional or managerial positions.

Disadvantaged-urban (low-metro) communities. Students in this
group attend schools in or around cities with a population greater than 200,000
where a high proportion of the residents are on welfare or are not regularly
employed.

Extreme-rural communities. Students in this group attend schools in areas
with a population under 10,000 where most of the residents are farmers or farm
workers.



Group Results

The distinction between cross-sectional and longitudinal survey research
designs is especially important to note in order to interpret changes in
results for groups of respondents. National Assessment does not report changes
for the same individuals; rather, it reports changes for the same types of
groups of respondents, such as those living in the Southeast or those
attending schools in rural areas. Thus, a group of respondents in one assessment
might have a composition of people different from the same group defined in
the same way in another assessment.

The longer the time between assessments, the more these groups might
differ. The Southeast, for example, might become more urbanized or its racial
composition might change because of migration between regions.^ The
extreme-rural respondents in any given year are defined as the 10% of our sample
attending the most rural schools; schools classified as extreme rural one year might
not be the most rural in the next assessment because of population shifts,
consolidation of schools, and so on. Every attempt has been made to keep the
category definitions constant; however, we know some changes in the composition
of these categories occurred between 1969-70 and 1976-77.

Group results are computed by estimating the mean percentage correct for
a reporting group in the same manner as described previously for the national



^U.S. Bureau of the Census, "Mobility of the Population of the United States:
March 1970 to March 1973," Current Population Reports, Series P-20, No. 262
(Washington, D.C.: U.S. Government Printing Office, 1974); U.S. Bureau of the
Census, "Geographic Mobility: March 1975 to March 1977," Current Population
Reports, Series P-20, No. 32 (Washington, D.C.: U.S. Government Printing
Office, 197[?]); National Center for Education Statistics, The Condition of
Education, 1977 (Washington, D.C.: U.S. Government Printing Office, 1977).



mean percentage correct. The national mean is then subtracted from the group
mean to obtain the group's difference from the national percentage correct.
For example, the mean percentage of correct responses for Northeastern
17-year-olds in 1977 was 46.5. Subtracting the national mean from the Northeastern
mean yields a Northeastern 17-year-old relative performance advantage of [?].3
percentage points.
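
The relative-performance arithmetic described above can be sketched as follows. The function names are hypothetical, and the means are invented round numbers rather than values from the assessment; the sketch only shows the order of the subtractions.

```python
def relative_performance(group_mean, national_mean):
    """Group difference from the nation: group mean minus national mean."""
    return group_mean - national_mean


def change_in_relative_performance(earlier_diff, later_diff):
    """Change in a group's difference from the nation between two assessments."""
    return later_diff - earlier_diff


# Hypothetical mean percentages correct, for illustration only.
diff_first = relative_performance(group_mean=50.0, national_mean=48.0)
diff_second = relative_performance(group_mean=47.5, national_mean=46.5)
change = change_in_relative_performance(diff_first, diff_second)

print(diff_first, diff_second, change)
```

In this invented case both the group and the nation decline, but the group's advantage narrows by one point; that narrowing, not the raw decline, is what a change in relative performance measures.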

Differences in group percentage (relative performance) and changes in
those differences from 1969-70 to 1972-73 and from 1972-73 to 1976-77 for the
full exercise sets are contained in Tables 3-1 through 3-3 for ages 9, 13 and
17, respectively.

In this report, we have chosen to emphasize changes in relative performance
for several reasons. Most reporting groups changed very little in relative
position over the course of the three assessments. That is, whatever the
initial advantage or disadvantage of a reporting group, the average percentage
of correct responses changed at about the same rate as the nation for each age
population. The mean difference from the nation, since it removes the overall
national trend, makes it easier to detect those reporting groups, such as
extreme rural, that have undergone major shifts in position relative to the
nation. Those differences were highly stable over the three assessments, as
depicted in Figures 3-1 and 3-2.

Figure 3-1 shows the range of group differences from the nation at each
age for sex, race, region and level of parental education. Figure 3-2 shows
the same information for type of community, size of community and grade in
school. For each age and reporting group, the dot is the weighted average of
mean group differences from 1969-77, while a line is drawn between the most
extreme mean group differences. When a consistent trend exists across the
change measures, an arrowhead has been placed on the line to indicate the
direction of change.
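
The three elements of each figure entry (dot, line and arrowhead) can be sketched as below. This is a reading of the description above, not the procedure actually used: the weights are left as an input because the report does not state them here, "consistent trend" is assumed to mean that every change measure has the same sign, and all names and values are hypothetical.

```python
def summarize_group(diffs, weights, changes):
    """Summarize one reporting group for plotting.

    diffs   -- mean differences from the nation, one per assessment measure
    weights -- weights for the average (an assumption; not given in the text)
    changes -- change measures between adjacent assessments
    """
    weighted_avg = sum(d * w for d, w in zip(diffs, weights)) / sum(weights)
    line = (min(diffs), max(diffs))  # endpoints of the plotted range
    if all(c > 0 for c in changes):
        trend = "up"
    elif all(c < 0 for c in changes):
        trend = "down"
    else:
        trend = None  # no arrowhead when the changes disagree in sign
    return weighted_avg, line, trend


# Illustrative values only, with equal weights.
avg, line, trend = summarize_group(
    diffs=[-3.5, -2.5, -2.0, 0.5],
    weights=[1, 1, 1, 1],
    changes=[1.0, 2.5],
)
print(avg, line, trend)
```

A group whose changes are all positive, as in this invented case, would be drawn with an upward arrowhead.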

Across the three assessments: 

• Males maintained their advantage over females, and the gap increased
with age from about 2 percentage points at age 9 to about 6 percentage
points at age 17.

• Performance of white students was consistently higher than that of
black students. Differences in performance ranged from 12 to 18 points
for the three age groups.

• Performance of students in the Northeast was consistently high while
performance in the Southeast was consistently low, ranging from about
4 percentage points below the nation at age 9 to about 2 percentage
points below at age 17. In the Central region, performance was
consistently above the nation, while students in the West performed at or
below the nation.

• Level of parental education was consistently related to achievement.
Students reporting that neither parent graduated from high school



TABLE 3-1. Reporting-Group Mean Differences in Percentage Correct From the
Nation for 1970, 1973 and 1977; Change in Mean Differences From 1970 to 1973
and From 1973 to 1977; and Standard Errors for the Total
Change Exercise Sets at Age 9

                              Mean Differences on 92 Exercises       Mean Differences on 71 Exercises
                              1970         1973         Change From  1973         1977         Change From
                                                        1970-73                                1973-77

Sex
  Male                        1.1*         1.0*         -0.1         1.3*         1.3*         0.0
    Standard error            (.2)         (.2)         (.2)         (.2)         (.2)         (.2)
  Female                      -1.1*        -1.0*        0.1          -1.3*        -1.4*        0.0†
    Standard error            (.2)         (.1)         (.2)         (.2)         (.2)         [illegible]

Race
  White                       3.0*         3.0*         0.0          2.7*         2.4*         -0.3
    Standard error            (.2)         (.3)         (.4)         (.2)         (.3)         (.4)
  Black                       -14.2*       -13.6*       0.5          -12.5*       -12.8*       -0.3
    Standard error            (.7)         (.6)         (.9)         (.6)         (.7)         (.9)

Region
  Northeast                   2.6*         1.8*         -0.8         1.3*         2.1*         0.8
    Standard error            (.5)         (.6)         (.8)         (.6)         (.6)         (.9)
  Southeast                   -5.8*        -4.3*        1.5          -3.8*        -4.2*        -0.4
    Standard error            (.7)         (.9)         (1.1)        (.8)         (.9)         (1.1)
  Central                     1.7*         1.6*         -0.1         1.6*         1.1          -0.5
    Standard error            (.6)         (.8)         (1.0)        (.7)         (.8)         (1.1)
  West                        0.4          0.3          -0.1         0.4          0.4          0.0
    Standard error            (.7)         (.8)         (1.0)        [illegible]  (.7)         (1.0)

Parental education
  Not graduated high school   -6.9*        -5.2*        1.8*         -5.2*        -6.4*        -1.2
    Standard error            (.6)         (.5)         (.8)         (.5)         (.6)         (.8)
  Graduated high school       0.5          0.7*         0.2          0.7*         1.1*         0.4
    Standard error            (.4)         (.3)         (.5)         (.3)         (.3)         (.4)
  Post high school            5.9*         5.4*         -0.5         5.2*         4.5*         -0.7
    Standard error            (.3)         (.2)         (.4)         (.3)         (.3)         (.4)

Type of community
  Extreme rural               -3.7*        -2.6*        1.0          -2.2*        0.7          2.9*
    Standard error            (1.3)        (1.0)        (1.6)        (.9)         (1.1)        (1.4)
  Low metro                   -15.2*       -13.4*       1.8          -12.0*       -11.2*       0.8
    Standard error            (1.1)        (.8)         (1.4)        (.7)         (1.3)        (1.5)
  High metro                  8.1*         6.6*         -1.5         5.7*         7.3*         1.6
    Standard error            (.7)         (.9)         (1.1)        (.9)         (.8)         (1.2)

Size of community
  Big city                    -3.6*        -3.5*        0.1          -3.6*        -4.6*        -1.0
    Standard error            (.7)         (.8)         (1.0)        (.7)         (1.0)        (1.2)
  Fringes around big cities   4.1*         2.8*         -1.3         2.5*         4.2*         1.7
    Standard error            (.7)         (.7)         (1.0)        (.7)         (.6)         (.9)
  Medium city                 1.2*         1.3          0.2          2.5*         -0.7         -3.2*
    Standard error            (.6)         (1.1)        (1.3)        (.9)         (1.4)        (1.6)
  Smaller places              -0.1         0.0          0.1          -0.1         0.1          0.2
    Standard error            (.4)         (.5)         (.6)         (.5)         (.5)         (.7)

Grade in school
  3                           -9.0*        -8.4*        .6           -7.6*        -6.9*        .7
    Standard error            (.5)         (.4)         (.6)         (.4)         (.4)         (.6)
  4                           3.4*         2.7*         -.7*         2.4*         2.3*         -.2
    Standard error            (.2)         (.2)         (.2)         (.2)         (.2)         (.2)

*Denotes differences or changes in differences greater than or equal to two standard errors.
†All computations were performed prior to rounding to one decimal place.
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
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TABLE 3-2. Reporting-Group Mean Differences in Percentage Correct From the
Nation for 1969, 1972 and 1976; Change in Mean Differences From 1969 to 1972
and From 1972 to 1976; and Standard Errors for the Total
Change Exercise Sets at Age 13

                              Mean Differences on 67 Exercises       Mean Differences on 75 Exercises
                              1969         1972         Change From  1972         1976         Change From
                                                        1969-72                                1972-76

Sex
  Male                        2.0*         2.1*         0.1          1.8*         2.3*         0.5*
    Standard error            [illegible]  (.2)         (.3)         (.2)         (.2)         (.2)
  Female                      -1.9*        -2.1*        -0.2         -1.8*        -2.2*        -0.4*
    Standard error            (.2)         (.2)         (.3)         (.2)         (.2)         (.2)

Race
  White                       3.1*         3.3*         0.3†         2.7*         2.6*         -0.1
    Standard error            (.3)         (.3)         (.4)         (.3)         (.3)         (.4)
  Black                       -15.2*       -16.6*       -1.4         -13.4*       -11.8*       1.6
    Standard error            [illegible]  (.6)         (.8)         (.5)         (1.0)        (1.1)

Region
  Northeast                   2.1*         2.0*         -0.1         1.5*         2.1*         0.6
    Standard error            (.6)         (.8)         (1.0)        (.6)         [illegible]  (1.0)
  Southeast                   -4.4*        -3.2*        [illegible]  -2.7*        -2.7*        0.0
    Standard error            (.9)         (.8)         [illegible]  (.7)         (.6)         (.9)
  Central                     2.2*         1.8*         -0.4         1.5*         1.6*         0.1
    Standard error            (.6)         (.8)         (1.0)        (.7)         (.7)         (1.0)
  West                        -0.2         -0.8         -0.5         -0.4         -1.4*        -1.1
    Standard error            (.6)         (.8)         (1.0)        (.7)         (.7)         (1.0)

Parental education
  Not graduated high school   -7.4*        -7.1*        0.4          -5.9*        -6.2*        -0.3
    Standard error            (.5)         (.5)         (.7)         (.5)         (.6)         (.7)
  Graduated high school       -1.3*        -0.3         1.1*         -0.2         -0.6*        -0.5
    Standard error            (.3)         (.3)         (.4)         (.3)         (.3)         (.4)
  Post high school            6.0*         6.3*         0.3          5.2*         4.9*         -0.2
    Standard error            (.3)         (.3)         (.4)         (.2)         (.2)         (.3)

Type of community
  Extreme rural               -4.3*        [illegible]  2.3          -1.9         -0.4         1.6
    Standard error            (1.2)        [illegible]  (1.7)        (1.1)        (.9)         (1.4)
  Low metro                   -11.9*       [illegible]  -1.2         -10.7*       -11.6*       -0.8
    Standard error            (1.1)        [illegible]  (1.7)        (1.2)        (1.4)        (1.8)
  High metro                  6.4*         [illegible]  0.4          5.4*         5.6*         0.1
    Standard error            (.8)         [illegible]  (1.0)        (.6)         (.6)         (.8)

Size of community
  Big city                    [illegible]  [illegible]  -0.3         -3.1*        -3.2*        -0.1
    Standard error            [illegible]  [illegible]  (1.2)        (.8)         (1.0)        (1.3)
  Fringes around big cities   [illegible]  [illegible]  -1.1         1.5*         2.5*         1.0
    Standard error            [illegible]  [illegible]  (.9)         (.6)         (1.0)        (1.2)
  Medium city                 [illegible]  [illegible]  -0.5         0.1          -0.1         -0.2
    Standard error            [illegible]  [illegible]  (1.6)        (1.1)        (1.0)        (1.5)
  Smaller places              [illegible]  [illegible]  0.7          0.6          0.2          -0.4
    Standard error            [illegible]  [illegible]  (.7)         [illegible]  (.4)         (.6)

Grade in school
  7                           [illegible]  [illegible]  -.1          -5.6*        -6.0*        -.4
    Standard error            [illegible]  [illegible]  (.7)         (.4)         (.3)         (.5)
  8                           [illegible]  [illegible]  .1           2.5*         2.4*         -.1
    Standard error            [illegible]  [illegible]  (.3)         (.2)         (.2)         (.2)

*Denotes differences or changes in differences greater than or equal to two standard errors.
†All computations were performed prior to rounding to one decimal place.



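TABLE 3-3. Reporting-Group Mean Differences in Percentage Correct From the
Nation for 1969, 1973 and 1977; Changes in Mean Differences From 1969 to 1973
and From 1973 to 1977; and Standard Errors for the Total
Change Exercise Sets at Age 17

                              Mean Differences on 64 Exercises       Mean Differences on 70 Exercises
                              1969         1973         Change From  1973         1977         Change From
                                                        1969-73                                1973-77

Sex
  Male                        3.0*         2.8*         -0.2         [illegible]  [illegible]  [illegible]
    Standard error            [illegible]  (.2)         (.3)         [illegible]  [illegible]  [illegible]
  Female                      -2.9*        -2.7*        [illegible]  [illegible]  [illegible]  [illegible]
    Standard error            (.2)         (.2)         (.3)         [illegible]  [illegible]  [illegible]

Race
  White                       1.6*         1.9*         [illegible]  2.2*         [illegible]  [illegible]
    Standard error            (.2)         [illegible]  (.3)         (.2)         [illegible]  [illegible]
  Black                       [illegible]  -10.4*       [illegible]  -12.6*       [illegible]  [illegible]
    Standard error            (.7)         (.4)         (.8)         (.5)         [illegible]  [illegible]

Region
  Northeast                   1.9*         1.5*         [illegible]  1.0          [illegible]  [illegible]
    Standard error            (.6)         (.5)         [illegible]  (.6)         [illegible]  [illegible]
  Southeast                   -3.2*        -1.6*        [illegible]  -2.1*        [illegible]  [illegible]
    Standard error            (.6)         (.6)         [illegible]  (.7)         [illegible]  [illegible]
  Central                     0.3          0.6          [illegible]  1.0          [illegible]  [illegible]
    Standard error            (.5)         (.6)         [illegible]  (.6)         [illegible]  [illegible]
  West                        0.2          -1.1*        [illegible]  -0.4         [illegible]  [illegible]
    Standard error            (.5)         [illegible]  [illegible]  (.6)         [illegible]  [illegible]

Parental education
  Not graduated high school   -5.7*        -6.3*        -0.6         -6.6*        [illegible]  [illegible]
    Standard error            (.4)         (.4)         (.6)         (.4)         [illegible]  [illegible]
  Graduated high school       -1.1*        -1.4*        [illegible]  -1.7*        [illegible]  [illegible]
    Standard error            (.3)         (.3)         (.4)         (.3)         [illegible]  [illegible]
  Post high school            4.2*         4.2*         [illegible]  4.7*         [illegible]  [illegible]
    Standard error            (.2)         (.2)         (.3)         (.2)         [illegible]  [illegible]

Type of community
  Extreme rural               -2.9*        -1.4*        [illegible]  -0.8         [illegible]  [illegible]
    Standard error            (1.0)        (.8)         [illegible]  (.8)         [illegible]  [illegible]
  Low metro                   -5.1*        -7.3*        [illegible]  -8.1*        [illegible]  [illegible]
    Standard error            (1.1)        (1.1)        [illegible]  (1.0)        [illegible]  [illegible]
  High metro                  5.9*         4.4*         [illegible]  4.7*         [illegible]  [illegible]
    Standard error            [illegible]  (.8)         [illegible]  (.8)         [illegible]  [illegible]

Size of community
  Big city                    -1.8*        -3.3*        [illegible]  -3.6*        [illegible]  [illegible]
    Standard error            (.8)         (.8)         [illegible]  (1.0)        [illegible]  [illegible]
  Fringes around big cities   2.1*         1.3          [illegible]  1.1          [illegible]  [illegible]
    Standard error            (.7)         (.8)         [illegible]  (.8)         [illegible]  [illegible]
  Medium city                 0.7          -0.1         [illegible]  -0.1         [illegible]  [illegible]
    Standard error            (.8)         (.8)         [illegible]  (.8)         [illegible]  [illegible]
  Smaller places              -0.5         0.5          [illegible]  0.8*         [illegible]  [illegible]
    Standard error            (.4)         (.4)         [illegible]  (.4)         [illegible]  [illegible]

Grade in school
  10                          [illegible]  -7.4*        [illegible]  -7.8*        [illegible]  [illegible]
    Standard error            [illegible]  (.6)         [illegible]  (.5)         [illegible]  [illegible]
  11                          [illegible]  1.1*         [illegible]  1.2*         [illegible]  [illegible]
    Standard error            [illegible]  (.1)         [illegible]  (.2)         [illegible]  [illegible]
  12                          [illegible]  2.6*         [illegible]  [illegible]  [illegible]  [illegible]
    Standard error            [illegible]  (.4)         [illegible]  (.4)         [illegible]  [illegible]

*Denotes differences or changes in differences greater than or equal to two standard errors.
†All computations were performed prior to rounding to one decimal place.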
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FIGURE 3-2. Type-of-Community, Size-of-Community and Grade-in-School
Reporting-Group Mean Differences in Percentage Correct From
the Nation in Three Assessments, Ages 9, 13 and 17
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consistently achieved 5 to 7 percentage points below the nation, while
those reporting at least one parent with post high school education
scored 4 to 6 points above the nation.

• Students in disadvantaged-urban (low-metro) communities performed 5 to
15 percentage points below the nation, and 17-year-olds' performance
was generally closer to the nation than that of the other ages. Students from
advantaged-urban (high-metro) communities performed about 4 to 8 points
above the nation; 17-year-olds, again, were closest to the nation.
Students in extreme-rural areas moved from well below the nation in
1969-70 to the national level in 1976-77.

• Students in big cities consistently performed below the nation, while
students in fringes around big cities performed above the nation.
Performance of students in medium cities and smaller places tended to be
at or near the national level, although the performance of 9-year-olds
in medium cities was somewhat erratic.

• Students one grade below the modal grades for their age (grades 3, 7
and 10) consistently performed 5 to 9 percentage points below the
nation, while those in the modal grades (grades 4, 8 and 11) performed
2 to 3 points above at ages 9 and 13. Seventeen-year-olds in the 11th
grade were about 1 point and 12th graders 2 to 3 points above the
nation.

The highly consistent performance patterns shown in Tables 3-1 to 3-3 and
Figures 3-1 and 3-2 and the general lack of change in performance relative to
the nation occurred through all of the science summaries. For that reason,
group results for type of science, 1972-73 objectives and the exercises used
in all three assessments have not been reproduced in this report.



CHAPTER 4 
THE ADULT SCIENCE ASSESSMENTS



The science assessments of young adults, ages 26-35, were similar in many
respects to those of 9-, 13- and 17-year-olds. Conceptually, National
Assessment extended its coverage of the American population at three stages of
education (late primary, middle school and high school) to an age group where most
members had completed their formal schooling. The same objectives were used,
and the exercises, while originally written for 13- or 17-year-olds, were also
appropriate for young adults. Nationally representative probability samples
of all age groups were assessed. A school sample was used for the three
school-age populations, while a household sample was used for young adults.
Details of sampling and data collection for the two types of surveys are
sufficiently different to merit a separate discussion of the young adult
assessments.



The 1969 Assessment of Young Adults 

In the summer of 1969, young adults born between July 1933 and June 1943
were assessed using the same primary sampling units (PSUs) used for the
in-school 17-year-old assessment.^ It was the first large-scale attempt to
collect achievement data in a household survey, and respondents were not paid to
participate in the assessment. Seventy-seven percent of the sample households
were successfully screened to see if any age-eligible persons lived there,
while only 57% of the age eligibles who were located agreed to participate,
yielding a 44% response rate. Achievement data were reported for young adults
in science, citizenship and writing. However, response rates in the 1972-73
and 1977 young adult assessments were so much higher that changes in
achievement from the 1969 adult assessment have not been reported. The remainder of
this chapter is devoted to the second and third adult science assessments.
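
The 44% figure follows from multiplying the two stage rates given above, as this short sketch shows. The function name `overall_response_rate` is hypothetical; only the 77% and 57% inputs come from the text.

```python
def overall_response_rate(screening_rate, participation_rate):
    """Overall response rate as the product of the household screening
    rate and the participation rate among located age eligibles."""
    return screening_rate * participation_rate


rate = overall_response_rate(0.77, 0.57)
print(f"{rate:.0%}")  # 0.77 * 0.57 = 0.4389, which rounds to 44%
```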

The 1972-73 and 1977 Assessments of Young Adults

The second and third adult science assessments were similar in large part. 
Their common features are described first, followed by brief summaries of 



^The 1969-70 assessment is described briefly in Appendix A and more fully in
1969-1970 Science: National Results and Illustrations of Group Comparisons,
Report 1, 1969-70 Assessment (Denver, Colo.: Education Commission of the States,
National Assessment of Educational Progress, 1970).
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their differences and summaries of changes in young adults' achievement between
1972-73 and 1977.

Both assessments were conducted by experienced household-survey staffs.
Great emphasis was placed on training, supervision and verification of field
work. Age-eligible adults were paid $5 per package for up to four packages of
assessment exercises. As a result, nearly all sample households were
successfully screened to locate age-eligible adults, and 79 to 84% of eligibles
participated in the two assessments.

Sample Design 

Deeply stratified, multistage probability sample designs were used in both
assessments. Stratification variables included geographic region as well as
measures of community size, and urban-rural and socioeconomic-status variables.^
Primary sampling units were made up of counties or groups of contiguous counties
with 1970 census populations of at least 20,000 persons. In states that have
no county definition (such as Alaska and some New England states), PSUs were
defined from comparable census or political units. Table 4-1 contains the
number of PSUs in each assessment.

Within each sample PSU, smaller secondary or tertiary units, or segments,
were defined and sampled. Segments are small, well-identified land areas
containing an average of 1[?] housing units in 1972-73 and 26 housing units in
1977. They can range in size from one side of one block in a large city to
most of a county in a rural area (number of segments used is shown in Table
4-1). Within each sample segment, all housing units were listed. A sample
(sometimes 100%) of housing units was then screened for eligible adults.^



^The designs for 1972-73 and 1977 differed somewhat in the ways these variables
were defined and the sampling stage at which stratification was introduced.
In 1977, recent screening data allowed stratification of the household by race
and age eligibility. Detailed documentation of the design used from 1970-71
to 1972-73 is contained in R. Moore et al., The National Assessment Approach to
Sampling (Denver, Colo.: Education Commission of the States, National Assessment
of Educational Progress, 1974). Similar documentation of the 1977 sample design
is contained in C. Benrud et al., Final Report on National Assessment of
Educational Progress Sampling and Weighting Activities for Assessment Year 08
(Raleigh, N.C.: Research Triangle Institute, February 1978).

Census-defined institutions and group quarters were excluded from the
definition of housing units.

Adults outside the defined birth-date range, those with language barriers and
those too functionally handicapped to respond to assessment materials were
excluded. Self-identified nonreaders were also excluded. About 4% of age
eligibles were excluded for those reasons, as shown in Table 4-2.
TABLE 4-1. Characteristics of the 1972-73 and 1977 Young Adult Assessments

                            1972-73 Assessment                      1977 Assessment

Subject areas               Science and mathematics                 Science, energy, health and reading

Number of packages          8, both areas in each package           4, one area per package

Type of exercises           6 packages: multiple-choice and         Multiple-choice
                              open-ended exercises
                            2 packages: interview/performance

Audiotapes                  6 packages: self-paced                  Science change exercises: self-paced
                            2 packages: none                        All other exercises: none

Incentive                   $5.00 per package                       $5.00 per package

Data collection period      October 1972 to May 1973                May 1977 to July 1977

Birth-date range            January 1, 1937, to December 31, 1946   January 1, 1941, to December 31, 1950

Number of primary
  sampling units            106                                     58

Number of segments          1,059                                   429



Field interviewers made multiple visits, if necessary, to housing units
to obtain screening information from occupants or neighbors. Occupants who
refused to supply screening data were called or visited by supervisory staff
to solicit cooperation.



Data Collection

When eligible adults were located, they were asked to fill out a background
questionnaire and complete up to four packages of assessment exercises
administered in random order. If they agreed to participate, they were paid $5
for each package completed. Each package was designed to take about 45 minutes.
The average number of packages completed per respondent was 3.83 in the second
assessment and 3.74 in the third. Response rates were 84% and 79% in the two
assessments, as shown in Table 4-2. Field work was continuously monitored by
a combination of mail, telephone and personal follow-ups with respondents to
verify that they had been assessed and that procedures had been properly
followed.

Scoring 

Sets of background questionnaires and packages were audited for completeness
and consistency and scored by the Measurement Research Center in Iowa
City, Iowa. Sampling weights were computed and adjusted for nonresponse.



Differences Between the 1972-73 and 1977 Assessments 
Sample Design

The 1972-73 sample was designed specifically for National Assessment, with
large enough samples (about 2,100 per package) to allow reporting by standard
assessment reporting categories (region, sex, race, parental education, and
size and type of community).

The 1977 assessment of young adults used a half-sample of Research Triangle
Institute's National General Purpose Sample. Because the sample was not
specifically designed for National Assessment, stratification was not strictly
optimal for NAEP's purposes. For example, census regional definitions were
used to stratify the sample rather than the Office of Business Economics'
regional definitions used by National Assessment. Also, socioeconomic indexes
were used to stratify the sample, but low socioeconomic areas were not
oversampled, as was done in 1972-73. The sample included 58 PSUs and about 1,300
respondents per package, which is not large enough for all National Assessment
reporting categories. It was possible to oversample blacks and report results
by race, but community-size categories were collapsed from four to two and
neither type-of-community nor regional results are reported.






TABLE 4-2. Screening and Response-Rate Data for the 1972-73
           and 1977 Assessments of Young Adults

Number and percent screened
Learning-Area Mix

The 1972-73 assessment comprised eight packages, with both science and
mathematics exercises in each package. Many of the exercises were open-ended, and
two packages contained interview and performance exercises. The 1977 assessment
contained four packages, one each of science, energy, health and reading
exercises.



Taping

In 1972-73, self-paced audiotapes were used to administer the exercises.
The field interviewer turned on the tape recorder while the respondent read an
exercise; the recorder was then turned off until the respondent was ready to
go on to the next exercise. In the 1977 assessment, the same procedure was
used for science exercises repeated from 1972-73. These exercises occurred in
the first half of the science package. Audiotapes were not used for other
exercises. In the 1972-73 assessment, the tapes were used for all exercises
except the interview and performance exercises, while in 1977 they were used for
only a portion of the exercises.



Time of Testing

The 1972-73 assessment was conducted between October and May; the
1977 assessment was conducted between May and July. Most adults
between the ages of 26 and 35 have completed their formal education, so there
is little reason to believe that time of testing has as much effect on their
performance as it has on that of school-age students.

It is also possible that there are some differences in the characteristics
of nonrespondents between fall and spring, and between spring and summer
survey periods. The shorter survey period in 1977 may have made it more
difficult to locate and assess eligibles who were away from home for extended
periods.



Released Exercises

As explained in Appendix C, National Assessment cautions against
using released exercises to measure changes in the achievement of school-age
populations. There are many ways for students to be exposed to exercises that
we have released for public use. We are, however, less concerned about reusing
released exercises with young adults. For adults, we could not identify any
plausible analogs to the ways students can be systematically exposed to
released exercises. Adults are not likely to be exposed to specific exercises
unless they are teachers or researchers directly involved in the subject area,
a very small proportion of the population.

Consequently, a number of previously released exercises were included in
the 1977 adult assessment to measure changes in science achievement. Some of
the 20 change exercises had been released after either the 1969-70 or 1972-73
assessments.

A short questionnaire was also administered for the Food and Drug
Administration after all other packages had been completed. Ninety percent of
the respondents agreed to complete this questionnaire.

National Results

Mean percentages of correct responses for each assessment and changes in
percentages of correct responses for young adults between 1972-73 and 1977 are
given in Table 4-3. The percentage of correct responses decreased 4 percentage
points between the two assessments on the 20 exercises available to measure
changes in achievement. Fifteen of the exercises were also administered to
17-year-olds enrolled in school during the two science assessments. The
percentages of correct responses were similar for 17-year-olds and adults in the
two assessments, and both age groups' percentage correct decreased between
1972-73 and 1977.

Mean percentages of correct responses and changes for both released and
unreleased exercises are also shown in Table 4-3. Decreases for both sets were
about 4 percentage points, the same as for the entire set of exercises.



Group Results for Young Adults

Differences in mean percentages of correct responses between reporting
groups and the nation in 1972-73 and 1977 and changes in those differences are
displayed in Table 4-4. Table 4-4 also contains estimated population
proportions for each reporting group and estimated standard errors for all mean
differences and changes in mean differences. Adult reporting-group definitions
are identical to those given in Chapter 3, with the following exceptions:

  • Community-size categories were collapsed. Big cities and fringes around
    big cities were combined, as were medium cities and smaller places.

  • Young adults' own education is reported, using the same category
    definitions as parental education for students in school.

  • The age range 26-35 was divided into ranges 26-30 and 31-35.

Unlike the results for 9-, 13- and 17-year-olds, young adults' weights were
not smoothed prior to estimating proportions and percentages of correct
responses. Data were available for only the assessment years 1970-71 through
1973-74 and 1977, too few points for effective weight smoothing.
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j;iciK Impact ^ or^ v -^^,.^g 

What between An ^^^^^"1^^ f &^JV^^t^''2n1^ 2en^h^'^''rf^Sn 

is. given beloW>>Mts. A^^ 6^ KA, of J^: f 5^1" if^ ^pP^V^^' 

1972-73 and 1^' -. ^cedilre^' , ^1°" , \ " H I?'' • . 



stage, the United States is divided into geographical units of counties or
groups of contiguous counties meeting a minimum size requirement. These units,
called primary sampling units (PSUs), are stratified by region and size of
community. From the list of PSUs, a sample of PSUs is drawn without replacement
with probability proportional to population size measures, representing all
regions and sizes of communities. Oversampling of low-income and extreme-rural
areas is performed at this stage by adjusting the estimated population size
measures of these areas to increase sampling rates.
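
The size-measure adjustment described above can be illustrated with a minimal Python sketch. It is not the National Assessment procedure itself: the function name `selection_probabilities`, the parameter `oversample_factor` and the toy size measures are assumptions made for illustration, and the sketch only computes expected selection probabilities rather than drawing the sample.

```python
def selection_probabilities(sizes, oversampled, n_sample, oversample_factor=2.0):
    """Expected selection probability of each PSU under PPS sampling.

    sizes: estimated population size measure for each PSU
    oversampled: True where the PSU is a low-income or extreme-rural area
    n_sample: number of PSUs to be drawn
    oversample_factor: hypothetical multiplier applied to oversampled PSUs
    """
    # Inflate the size measure of the areas to be oversampled.
    adjusted = [s * oversample_factor if flag else s
                for s, flag in zip(sizes, oversampled)]
    total = sum(adjusted)
    # A probability capped at 1.0 marks a PSU that would be taken with certainty.
    return [min(1.0, n_sample * a / total) for a in adjusted]


probs = selection_probabilities([100, 100, 200], [False, True, False], n_sample=1)
print(probs)  # [0.2, 0.4, 0.4]
```

In the toy example the second PSU has the same population as the first but twice its chance of selection, which is the effect the adjusted size measures are meant to produce.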

In the second stage, all public and private schools within each PSU
selected in the first stage are listed. Schools within each PSU are selected
without replacement with probabilities proportional to the number of age
eligibles in the school.

The third stage of sampling occurs during the data collection period. A
list of all age-eligible students within each selected school is made. A
simple random selection of eligible students, without replacement, is obtained,
and item booklets are administered to selected students. Specially trained
field personnel select the sample and administer the booklets. In each
assessment, 13-year-olds are assessed in the months of October, November and
December; 9-year-olds in January and February; and in-school 17-year-olds in
March and April.

When funding levels permit, the sample of in-school 17-year-olds is
supplemented with a sample of out-of-school 17-year-olds. Between 1969-70 and
1972-73, out-of-school 17-year-olds were assessed as part of the household
sample of young adults. The out-of-school 17-year-old population is relatively
small and expensive to locate through a household sample. Starting in 1970-71,
the household sample was augmented by a supplementary sample selected from
lists of dropouts and early graduates obtained from the schools at the time of
the regular assessment. From 1973-74 on, only the supplementary sample has
been used to assess out-of-school 17-year-olds. The household sample was
dropped because it afforded only slightly better population coverage while
costing much more than the supplementary one.

In 1976-77, funding limitations precluded any assessment of out-of-school
17-year-olds. In order to make the 17-year-old populations comparable for all
three science assessments, results are given for 17-year-olds enrolled in
public or private schools during each assessment. Results for out-of-school
17-year-olds are not included in this report.

Each respondent in the sample does not have the same probability of
selection because some subpopulations are oversampled, and adjustments are made
to compensate for some schools' refusal to participate and for student
nonresponse. The selection probability of each individual is computed, and its
reciprocal is used to weight each response in any statistical calculation to
compensate for unequal rates of sampling and to ensure proper representation in
the population structure.
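
As a rough illustration of this weighting, the Python sketch below forms a base weight as the reciprocal of a respondent's overall selection probability, taken here to be the product of the three stage probabilities. The nonresponse step is an assumption: the text does not say how the adjustment is made, so a simple ratio adjustment within one adjustment class is shown, and the function names `base_weight` and `adjust_for_nonresponse` are hypothetical.

```python
def base_weight(stage_probabilities):
    """Reciprocal of the overall selection probability.

    stage_probabilities: selection probability at each sampling stage
    (for example PSU, school within PSU, student within school).
    """
    overall = 1.0
    for p in stage_probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError("each stage probability must be in (0, 1]")
        overall *= p
    return 1.0 / overall


def adjust_for_nonresponse(weights, responded):
    """Assumed ratio adjustment: respondents absorb the weight of nonrespondents.

    weights: base weights of all sampled individuals in one adjustment class
    responded: parallel list of booleans
    """
    total = sum(weights)
    responding_total = sum(w for w, r in zip(weights, responded) if r)
    factor = total / responding_total
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]


w = base_weight([0.5, 0.1, 0.2])   # one respondent stands for about 100 people
adjusted = adjust_for_nonresponse([10.0, 10.0, 20.0], [True, False, True])
```

After the adjustment the weights of the respondents still sum to the weighted total of everyone sampled, so the estimated population size is preserved.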


The numbers of PSUs, schools within PSUs and students within schools are
determined by optimum sampling principles. That is, a sample design is selected
that will minimize costs while achieving a desired level of precision.



Table A-1 displays the number of PSUs and schools within PSUs selected
in 1969-70, 1972-73 and 1976-77 by age.

TABLE A-1. Number of PSUs and Schools Within PSUs
           Selected in 1969-70, 1972-73 and 1976-77

           1969-70 Assessment    1972-73 Assessment    1976-77 Assessment
           No. of    No. of      No. of    No. of      No. of    No. of
           PSUs      Schools     PSUs      Schools     PSUs      Schools

Age 9      204       935         116       971         75        451
Age 13     205       749         116       979         75        472
Age 17     193       670         116       798         75        428

Differences in Sample Design:
1969-70, 1972-73 and 1976-77

The 1976-77 sample was drawn according to the following procedures. Two
types of PSUs were identified: (1) large-size population areas defined by the
U.S. Bureau of the Census as Standard Metropolitan Statistical Areas (SMSAs)
and (2) other contiguous non-SMSA counties grouped together to meet certain
minimum-size requirements. The first stratification of PSUs was by geographic
region, as defined by the Office of Business Economics, U.S. Department of
Commerce (see Chapter 3).

Within regions, PSUs were classified into five size-of-community (SOC)
categories:

SOC 1       PSUs corresponding to the 13 largest SMSAs after adjusting the
            population size to compensate for oversampling low-income
            metropolitan areas. These PSUs have selection probabilities
            so large that under our allocation procedures they are certain
            to be included in our sample each year. These PSUs are
            designated as self-representing.

SOC 2       PSUs corresponding to the remaining 57 SMSAs with over 500,000
            population.

SOC 3       PSUs corresponding to the remaining 162 SMSAs.

SOCs 4, 5   PSUs made up of non-SMSA counties. SOCs 4 and 5 are determined
            so that half of the remaining population (after adjustment for
            oversampling of rural areas) falls into each category. SOC 4
            contains PSUs in which less than 60% of the residents are
            classified as rural.

Since the self-representing PSUs are included in the sample every year,
they actually represent an additional level of stratification within the
size-of-community strata. Each self-representing SMSA was divided further into
geographical substrata or nonoverlapping replicates that constituted multiples
of convenient work units for item administration. These multiple work units
were included with the rest of the non-self-representing PSUs to form the pool
of first-stage sampling units. To ensure adequate representation, National
Assessment doubled the sampling rate of low-income and rural areas.

In 1975-76, first-stage units were selected simultaneously for four
consecutive assessment years (1975-76 through 1978-79), as were schools in the
self-representing PSUs. The present sample design requires that every four
^nJ at least once in every ^state and not more than once in 

5° :u ^^/''^ ^""^ laOl primary sampling units in the primary sampling 
Irselec^ed each'^ear' ^ ' Z'"'" ""^'"^ 75 first-stage sampling units 

Within the primary strata, public and private schools were listed along
with the estimated number of youngsters eligible at each age.

Small schools were clustered until they were large enough to respond to the
same number of packages as the larger schools in a stratum. Schools or school
clusters were selected without replacement with probability proportional to
the number of age eligibles in the school or cluster of schools. Once schools
were identified, districts were contacted to check for changes in grade range
and the existence of new schools. This information was used to revise the
probabilities of schools' selection.

In the third stage, students were selected with equal probability and
without replacement within each sampled school. The number of students
selected was proportional to the number of age eligibles, with oversampling in
low-income and rural areas.

During data collection, allowing for variable group sizes for each package
administration within schools enabled National Assessment to obtain desired
sample sizes in schools having characteristically low response rates. This
feature also permitted last-minute modifications and adjustments to selection
probabilities necessitated by enrollment changes.

The sample designs used in 1969-70 and 1972-73 differed somewhat from
those used in the 1976-77 assessment.¹ First, size measures for SMSAs, counties
and urban areas in 1976-77 were based on 1970 census data, while those in
1969-70 were based on 1960 census data. Size measures in 1971-72 were based
on 1960 census data and first-count data from the 1970 census.

Another difference occurred in the PSU sample design. In 1969-70, PSUs
were stratified by region, size of community, a measure of socioeconomic status
(SES) and geographic proximity. There was no requirement that all states be
included in the sample. In 1972-73, the PSUs were stratified by region, size
of community and SES. In addition, the sample was constrained to include all
states. The sampling of PSUs in 1972-73 was accomplished by using a controlled
selection procedure.² In 1976-77, PSUs were stratified by region and size of
community, with the constraint that each state must appear in the sample once
every four years, and controlled selection of PSUs was abandoned.

The size-of-community (SOC) stratifications in 1969-70 and 1972-73 were
similar to each other but different from those of 1976-77. There were only
four SOC stratifications in the first assessment of science. The first SOC
category in 1969-70 and 1972-73 consisted of all central cities with overall
population greater than 180,000. The second SOC category consisted of the
remainder of the SMSA containing the central city in SOC 1. The SOC 3
category in 1969-70 consisted of the remainder of the SMSAs and all counties not
in SMSAs containing at least one city with a population over 15,000.
SOC 3 for 1972-73 was similar, except that the minimum population of the city
was 25,000. In both 1969-70 and 1972-73, the SOC 4 category consisted of all
the remaining counties not in SOCs 1, 2 or 3.

In 1976-77, oversampling of low-income metropolitan areas and extreme-rural
areas was accomplished at the primary stage by increasing the estimated
population size measures of PSUs containing these areas and then sampling with
probabilities proportional to these adjusted size measures. In 1969-70 and
1972-73, a poverty index was used to stratify PSUs into high- and low-SES
stratifications. The sampling rates within these strata were then increased
in order to achieve the desired oversampling.

In the 1976-77 assessment, packages of exercises were administered in
schools to groups of students varying in size depending on an estimate of the
rate of nonresponse for that school. The administration session sizes were
planned to vary up to about 35 students at each age. In 1969-70 and
1972-73, the planned session sizes were fixed at 12 students at each age.

¹For details on the 1969-70, 1972-73 and 1976-77 sample design and data
collection, see C. Benrud et al., Final Report on National Assessment of
Educational Progress Sampling and Weighting Activities for Assessment Year 08
(Research Triangle Park, N.C.: Research Triangle Institute, 197[?]);
J. Chromy and D. Horvitz, "Structure of Sampling and Weighting," 1969-1970
Science: National Results and Illustrations of Group Comparisons, Report 1,
1969-70 Assessment (Denver, Colo.: Education Commission of the States, National
Assessment of Educational Progress, 1970); R. Moore et al., The National
Assessment Approach to Sampling (Denver, Colo.: Education Commission of the
States, National Assessment of Educational Progress, 1974).

²R. Moore et al., The National Assessment Approach to Sampling.
Estimation of Standard Errors

Several measures of achievement that National Assessment uses in its
reports were described in Chapter 1. The sample designs described in the
previous section are complex, deeply stratified, multistage probability sample
designs. A reasonably good approximation of standard error estimates of these
achievement measures can be obtained by applying the jackknife procedure to
first-stage sampling units within strata, using the method of successive
differences and accumulating across strata.

In this section the measures of achievement are first defined in algebraic
form, followed by a description of the jackknife method used by National
Assessment to estimate their standard errors.



Measures of Achievement

Based on the sample design, a weight is assigned to every individual who
responds to an exercise administered in an assessment. The weight is the
reciprocal of the probability of selecting a particular individual to take a
particular exercise. Since the probabilities of selection are based on an
estimated number of people in the target age population, the weight for an
individual estimates the number of similar people that that individual represents
in the age population.

A sum of the weights for all individuals at an age level responding to an
exercise is an estimate of the total number of people in that age population.
A sum of weights for all individuals at an age responding correctly to an
exercise is an estimate of the number of people who would be able to respond
correctly in the age population, if the entire population were assessed. These
concepts also apply to any reporting group (e.g., defined by region, sex,
race, etc.) and category of response (e.g., correct, incorrect and "I don't
know").



$w_{ihk}^{e}$  = sum of weights for respondents to exercise e who are in
                reporting subgroup i and who are in the kth PSU of the hth
                sampling stratum.

$c_{ihk}^{ej}$ = sum of weights for respondents to exercise e who are in
                subgroup i, who are in the kth PSU of stratum h and who selected
                response category j (e.g., correct response) for the exercise.

Note that $w_{ihk}^{e} = \sum_{j} c_{ihk}^{ej}$.

Then, summing k over the $n_h$ sample PSUs in stratum h, and summing over the H
sampling strata,

    $W_{i++}^{e} = \sum_{h=1}^{H} \sum_{k=1}^{n_h} w_{ihk}^{e}$

estimates the number of eligibles in the population who are in subgroup i.
Similarly,

    $C_{i++}^{ej} = \sum_{h=1}^{H} \sum_{k=1}^{n_h} c_{ihk}^{ej}$

estimates the number of eligibles in the population who are in subgroup i and
who would select response category j for exercise e. The proportion of
eligibles in subgroup i who would select response category j on exercise e is
estimated by:

(1)  $P_{i}^{ej} = C_{i++}^{ej} / W_{i++}^{e}$

In the special case where the percentage of all age eligibles who would
select response category j on exercise e is estimated, the index A (for All)
will be used in place of i as follows:

(2)  $P_{A}^{ej} = C_{A++}^{ej} / W_{A++}^{e}$

In National Assessment reports, the proportion in (1) multiplied by 100
is called the group percentage, and the proportion in (2) multiplied by 100 is
called the national percentage. The difference between the proportion in
subgroup i who would select category j on exercise e and the proportion in the
nation is denoted by:

(3)  $\Delta P_{i}^{ej} = P_{i}^{ej} - P_{A}^{ej}$

National Assessment also reports the arithmetic mean of the percentage of
correct responses over sets of exercises corresponding to the measures in (1),
(2) and (3). These means are taken over the set of all exercises or a subset
of exercises, classified by a reporting topic or content objective. The mean
percentage of correct responses taken over m exercises in some set of
exercises corresponding to measures (1), (2) and (3) are, respectively:

(4)  $\bar{P}_{i} = \frac{1}{m} \sum_{e=1}^{m} P_{i}^{e}$

(5)  $\bar{P}_{A} = \frac{1}{m} \sum_{e=1}^{m} P_{A}^{e}$

(6)  $\Delta \bar{P}_{i} = \bar{P}_{i} - \bar{P}_{A}$

Note that the response category subscript j has been suppressed since the
means are understood to be taken over the correct response category for each
exercise.
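
The measures defined above can be sketched in a few lines of Python. This is an illustrative reading of measures (1) through (3) and their means over a set of exercises, not National Assessment's own software; the record layout (a list of dictionaries with `exercise`, `group`, `weight` and `response` keys), the function names and the toy data are all assumptions.

```python
def weighted_proportion(records, exercise, category, group=None):
    """Measure (1) for a subgroup, or measure (2) for the nation when group is None."""
    chosen = total = 0.0
    for r in records:
        if r["exercise"] != exercise:
            continue
        if group is not None and r["group"] != group:
            continue
        total += r["weight"]              # accumulates W
        if r["response"] == category:
            chosen += r["weight"]         # accumulates C
    return chosen / total


def group_difference(records, exercise, category, group):
    """Measure (3): subgroup proportion minus national proportion."""
    return (weighted_proportion(records, exercise, category, group)
            - weighted_proportion(records, exercise, category))


def mean_proportion(records, exercises, group=None, category="correct"):
    """Mean of the proportions over a set of exercises, for a subgroup or the nation."""
    values = [weighted_proportion(records, e, category, group) for e in exercises]
    return sum(values) / len(values)


# Toy data: four weighted respondents to one exercise (values are made up).
records = [
    {"exercise": "e1", "group": "Northeast", "weight": 2.0, "response": "correct"},
    {"exercise": "e1", "group": "Northeast", "weight": 1.0, "response": "incorrect"},
    {"exercise": "e1", "group": "West", "weight": 3.0, "response": "correct"},
    {"exercise": "e1", "group": "West", "weight": 2.0, "response": "incorrect"},
]
national = weighted_proportion(records, "e1", "correct")             # 5/8
northeast = weighted_proportion(records, "e1", "correct", "Northeast")  # 2/3
```

Multiplying any of these proportions by 100 gives the corresponding percentage used in the reports.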



Each of these six achievement measures is computed and routinely used
in reports describing achievement data for any assessment. The simple
difference in these measures between two assessments of the same exercises (or
sets of exercises) provides six measures of change in achievement that are
routinely used in National Assessment's change reports. The next section
describes how standard errors are estimated for the 12 statistics routinely used
in National Assessment reports.



Computation of Standard Errors

In order to obtain an approximate measure of the sampling variability in
the statistics (1) through (6), a jackknife replication procedure for
estimating the sampling variance of nonlinear statistics from complex, multistage
samples was tailored to National Assessment's sample design. References (4),
(5) and (7) provide information about the jackknife technique, while reference
(3) describes how the procedure is used in estimating standard errors for
National Assessment's sample designs.

To demonstrate the computational aspects of this technique, consider
estimating the variance of the statistic in (1), the proportion of age eligibles
in subgroup i who would select response category j on exercise e.

This statistic is based on data from all the $n_h$ PSUs in the H strata. Let
$P_{i-hk}^{ej}$ be defined as a replication estimate of $P_{i}^{ej}$,
constructed from all the PSUs excluding the data from PSU k in stratum h. These
replication estimates are computed as if the excluded PSU had not responded and
a reasonable nonresponse adjustment is used to replace the data in PSU hk in
estimating $P_{i}^{ej}$. Several choices for replacing the data in PSU hk are
available. In order to obtain a convenient and computationally efficient
algorithm for approximating standard errors, National Assessment replaces
$c_{ihk}^{ej}$ and $w_{ihk}^{e}$ from the hkth PSU with corresponding sums from
another paired PSU in the same stratum. The replicate estimate is then computed.
The replicate estimates to be used in the calculations are determined by
arranging all of the PSUs in each stratum into successive pairs. That is, PSU 1
is paired with PSU 2, PSU 2 with PSU 3, 3 with 4, ..., ($n_h - 1$) with $n_h$
and PSU $n_h$ with PSU 1.

" . . . ' ' ' ' • * 

The contribution to the variance of $P_{i}^{ej}$ by each pair of PSUs is the
change in the value of the statistic incurred by replacing the data from each
PSU in the pair with the data from the other PSU in the pair and recomputing
$P_{i}^{ej}$ in the usual way. This produces two replicate estimates. Squaring
the difference between these replicate estimates and then dividing by eight
measures the contribution of this pair of PSUs to the total variance. The sum of
these contributions over all $n_h$ successive pairs in the stratum is the
contribution by stratum h to the total variance. The square root of the sum of
the H stratum contributions is the estimate of the standard error of
$P_{i}^{ej}$.

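Algebraically, the two replicate estimates for the pair k, k+1 (where
k = 1, ..., $n_h$ and $n_h + 1 = 1$) are:

(7)  $P_{i-hk(k+1)}^{ej} = \frac{C_{i++}^{ej} - c_{ihk}^{ej} + c_{ih(k+1)}^{ej}}{W_{i++}^{e} - w_{ihk}^{e} + w_{ih(k+1)}^{e}}$

and

(8)  $P_{i-h(k+1)k}^{ej} = \frac{C_{i++}^{ej} - c_{ih(k+1)}^{ej} + c_{ihk}^{ej}}{W_{i++}^{e} - w_{ih(k+1)}^{e} + w_{ihk}^{e}}$

The contribution to the total variance from stratum h is:

(9)  $\mathrm{var}_h\, P_{i}^{ej} = \frac{1}{8} \sum_{k=1}^{n_h} \left( P_{i-hk(k+1)}^{ej} - P_{i-h(k+1)k}^{ej} \right)^2$

And, finally, an estimate of the standard error of $P_{i}^{ej}$ is:

(10)  $SE(P_{i}^{ej}) = \left( \sum_{h=1}^{H} \mathrm{var}_h\, P_{i}^{ej} \right)^{1/2}$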

Multiplying $P_{i}^{ej}$ by 100 yields the percentage of response to category j.
Multiplying $SE(P_{i}^{ej})$ by 100 yields the corresponding estimated standard
error of the percentage.
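
The paired-replicate computation described above might be sketched in Python as follows. This is a minimal reading of the verbal description (successive pairs within a stratum, squared replicate differences divided by eight, summed over pairs and strata), not National Assessment's production code. The data layout is an assumption: each stratum is a list of PSUs, and each PSU is a `(c, w)` tuple holding that PSU's weighted count for the response category and its total weight for the subgroup and exercise in question.

```python
import math


def jackknife_proportion_se(strata):
    """Return (proportion, standard error) from per-PSU weighted sums.

    strata: list of strata; each stratum is a list of (c, w) tuples, one per PSU.
    """
    C = sum(c for psus in strata for c, _ in psus)
    W = sum(w for psus in strata for _, w in psus)
    variance = 0.0
    for psus in strata:
        n = len(psus)
        for k in range(n):
            c_k, w_k = psus[k]
            c_next, w_next = psus[(k + 1) % n]   # the last PSU is paired with the first
            # Replicate with PSU k replaced by its pair partner.
            p_a = (C - c_k + c_next) / (W - w_k + w_next)
            # Replicate with the partner replaced by PSU k.
            p_b = (C - c_next + c_k) / (W - w_next + w_k)
            variance += (p_a - p_b) ** 2 / 8.0
    return C / W, math.sqrt(variance)


# Toy example: one stratum with two PSUs of equal weight (values are made up).
p, se = jackknife_proportion_se([[(2.0, 10.0), (8.0, 10.0)]])
print(p, round(se, 6))  # 0.5 0.3
```

In the toy example the two PSUs disagree sharply (20% versus 80%), so the replicate estimates swing between 0.2 and 0.8 and the estimated standard error is large; with identical PSUs the replicates coincide and the standard error is zero.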

In general, the jackknifed standard errors of the proportion estimates
will be larger than the simple random sampling formula $(pq/n)^{1/2}$, where
$p = P_{i}^{ej}$, $q = 1 - p$ and n is the number of sampled respondents in
subgroup i who took the exercise. The larger size of $SE(P_{i}^{ej})$ reflects
mainly the loss of precision due to cluster sampling of schools and students.



The standard errors for the achievement measures (2) through (6) are
computed through a series of steps analogous to those followed in computing
$SE(P_{i}^{ej})$.

The most complicated step in computing standard errors occurs in forming
the paired replicate estimates analogous to (7) and (8) for each successive
pair of PSUs. Once this bookkeeping chore is done, the computations for (9)
and (10) follow in a straightforward manner.

The standard errors for the differences between two assessments for any
of the achievement measures (1) through (6) are computed as the square root
of the sum of the squared standard errors from each of the separate assessments.
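
As a small worked illustration of this rule, the Python sketch below combines two standard errors into the standard error of a change. The function name `se_of_change` and the numbers in the example are hypothetical; the formula simply restates the square-root-of-summed-squares rule given above, which treats the two assessment samples as independent.

```python
import math


def se_of_change(se_first, se_second):
    """Standard error of the difference between two assessment estimates."""
    return math.sqrt(se_first ** 2 + se_second ** 2)


# Hypothetical standard errors, in percentage points, for two assessments.
change_se = se_of_change(0.6, 0.8)   # approximately 1.0 percentage point
```

Because the squared errors add, the standard error of a change is always at least as large as the larger of the two component standard errors.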

The size of the standard errors depends largely on the number of PSUs and
schools included in the sample (Table A-1), but also on the number of
respondents in each of the reporting groups. Table A-2 shows the average number of
students responding to an exercise package for each of the reporting groups
discussed in this report, for each age and for each of the three science
assessments.

The size of the standard errors of the means of the achievement measures
for sets of exercises is also influenced by the number of exercises in the
exercise set and the number of packages over which the items in the set are
spread. Tables A-3 and A-4 show the number of exercises and packages included
in the mean achievement measure for each of the content categories included in
this report.



TABLE A-2. Average Number of Respondents in Reporting Groups Taking a
           Package of Exercises, by Age and Assessment Year

                                    Age 9                Age 13           Age 17 In School†
                               1970   1973   1977    1969   1972   1976    1969   1973   1977

National                      2,434  2,663  2,478   2,411  2,612  2,565   2,083  2,351  2,649

Region
  Northeast                     618    656    585     625    651    587     577    573    581
  Southeast                     563    669    646     562    667    708     423    596    673
  Central                       574    672    736     570    649    764     507    596    850
  West                          678    665    510     654    645    506     577    586    545

Sex
  Male                        1,231  1,328  1,245   1,166  1,294  1,268   1,013  1,126  1,313
  Female                      1,203  1,335  1,233   1,231  1,318  1,297   1,070  1,225  1,336

Race
  White                       1,825  1,997  1,911   1,799  1,977  1,940   1,723  1,852  2,155
  Black                         390    466    391     416    436    473     237    358    359

Parental education
  Not graduated high school     269    271    234     361    417    336     432    455    418
  Graduated high school         562    564    681     762    792    868     685    720    897
  Post high school              794    787    729   1,038    994  1,013     931  1,028  1,212

Type of community
  Extreme rural                 240    255    247     242    263    261     206    230    256
  Low metro                     243    266    252     243    264    263     212    239    266
  High metro                    243    267    241     239    260    261     209    234    267

Size of community
  Big city                      665    619    617     651    583    633     558    439    622
  Fringes around big cities     403    515    461     381    531    484     334    493    488
  Medium city                   326    372    274     371    365    218     334    326    274
  Smaller places              1,040  1,157  1,126   1,007  1,133  1,230     857  1,094  1,265

Grade
  3, 7, 10                      558    646    575     581    693    663     267    305    350
  4, 8, 11                    1,779  1,946  1,855   1,728  1,809  1,842   1,446  1,688  1,977
  12                              -      -      -       -      -      -     323    304    286

†Seventeen-year-olds enrolled in school, excluding follow-ups in the 197?
assessment.
TABLE A-3. Number of Change Exercises in Various Content Categories and Number of
Packages in Which the Exercises Appeared in 1969-70 and 1972-73, Ages 9, 13 and 17

>f of , Nupber of If of Nymber of. - ^ Number ofTiier of 
,»si«^te ' MM packages ; Exercises Packaaes. 

MM:. 'B M'. '.^ ~ HTM 

■All exercises ' . ' % 8 7, ; , 9 9, ' 54 ' 

Ij/pe of science . ' > ' ' ' ■ ■ 

^8'°%' , , '8 7 ^3 8 ' 9, ' ' 20.' ■ lo 10 

■ Pliysical science. W 8 1 : '9 g ' 30 ■ , in n 

. Unclassified ; , If 8' 7 5 5 ■ 7 ' ., 5 5 

■1972-73 objectives • . ' ■ ' , ' 

■to' ... 8.. 6. ./ r ■ 9. 8' 34 ■ 11- 11 

: . Onderstand/apply,, ■ 5 8 . '7 28 9 '9 30 in ii' 

: Appreciate .j 5,5 ^ 2 2 2 . 0 ' 0 0 

Exercises, usecl in all , , ' 

f; three assessments • . ; : ' f -j" ■ ' J3 :8'' 8 ^ 23 ' ' 9, 10 




TABLE A-4. Number of Change Exercises in Various Content Categories and Number of
Packages in Which the Exercises Appeared in 1972-73 and 1976-77, Ages 9, 13 and 17



lAll exercises ■, 

Type of science 
■ Biology < 

Physical science. 

■Unclassified 

1972-73 objectives 
' Know \ • .. . 

Understand/apply . 

Appreciate. 

'■ ■ ' >i ■■ ■■ 

Exercises 'used in. all 
" tliree assessments 




E 



NuiAer of 



W] 1977, 



7 



7/ 6. 



ir- 1: 7 

■ 1 i 



7 7 

I '8 .7 
.3 ■ i 



Age 13 , 
Number' of Number of 
Excises Jad^ 
m 1976 

7.5 9 10 



23 



38 
37 

0' 



'9 



5 ■• , 4 



10 
10 

4 



9 lO' 
9 10 
0 0 



23 „ 8 10 



■ kR ■ 
iuiberof Number of 
' Exercises Pacbjies 

m m 

70 -11 11 



19 -^lO' 10 



45 
6 



11 11 
5 5 



31 
37 



II ■ 11 

11 ..,.11 



2 2 2 



^3 , loliO 
APPENDIX B

ESTIMATED POPULATION PROPORTIONS OF REPORTING GROUPS
BASED ON NATIONAL ASSESSMENT SAMPLES,
1969-70, 1972-73 AND 1975-76



The estimated population proportions for reporting groups shown in this
appendix are based on weights derived from the sampling process used in the
three assessments of 9-, 13- and in-school 17-year-olds. These proportions
vary from year to year due to random sampling variability or systematic
differences in sampling procedures. A better estimate of population
proportions for any single year can be obtained by smoothing¹ the proportions
over several assessment years. Smoothing does not make the estimated
proportions identical but does reduce variability. The estimated population
proportions shown in this appendix and used in estimating performance were
obtained after smoothing proportions from the first eight years of assessment.

The purpose of smoothing estimated population proportions is to reduce
sampling fluctuations that can affect estimates of the change over time in the
percentage of acceptable responses to an exercise. For example, the percentage
of acceptable responses for an age group is a function of the relative
proportions of high-performing and low-performing groups. If the relative
proportions of these groups are very different in different assessments due to
sampling variability, then a portion of the change in percentage of acceptable
responses for an age group is directly attributable to yearly sampling
differences in the relative proportions of high- and low-achieving groups.
Smoothing estimates of population proportions removes a large portion of the
sampling variability while preserving, as far as possible, actual trends
occurring in the age population.

The specific procedure used to obtain the smoothed population proportions
that were used in this report is detailed below. This procedure, which was
applied independently to each of the three age groups, is basically a
weighting-class adjustment applied independently to each reporting category
(nation, region, sex, etc.). By applying this weighting-class procedure
independently to each reporting variable, it was possible to produce good
estimates of the marginal proportions of people within each category of the
variable, while disturbing as little as possible the relationships between
other reporting variables within the adjusted variable.

¹The word "smoothing" is used here in the sense of drawing a "smooth" curve to
fit a sequence of numbers. Proportions for each reporting group covering eight
years were smoothed by the robust/resistant procedures described in Chapter 7,
Exploratory Data Analysis by John W. Tukey (Reading, Mass.: Addison-Wesley,
1977).

The same weighting-class partitioning of the population was used for all
ages and reporting variables. For each age, the entire population of eligibles
was partitioned into nine cells, called smoothing cells, on the basis of
membership in a variety of demographic categories determined by the race,
grade, home reference items and parental-education variables. The purpose of
the partitioning was to obtain subgroups of eligibles that exhibited
substantial differences in performance on science exercises. In addition to
differentiating on performance, the smoothing cells were required to contain
adequate numbers of eligibles to ensure stability of the weight adjustments.
These criteria produced the smoothing cells detailed in Table B-1.

For each age and reporting variable, the population of eligibles was
partitioned into subgroups determined by the various categories of the variable
and by the smoothing cells. For example, classification of the population by
sex and the smoothing cells produced a partitioning consisting of 18 subgroups:
males in smoothing cell 1, males in smoothing cell 2, ..., females in smoothing
cell 9. Estimates of the proportions of eligibles in each of the subgroups
were then obtained for each of the eight assessment years. The estimated
proportion of eligibles in a particular subgroup for a given year was computed
as the sum of weights of respondents in the subgroup assessed that year divided
by the sum of weights of all eligibles.
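
As a rough illustration of this step, the sketch below computes subgroup proportions for one year from respondent weights. The record layout (category, smoothing cell, weight) and all of the numbers are hypothetical; the report does not describe how its data files were organized.

```python
from collections import defaultdict


def subgroup_proportions(respondents):
    """Estimate the proportion of eligibles in each subgroup for one year.

    `respondents` is an iterable of (category, smoothing_cell, weight)
    tuples. Each proportion is the sum of weights in the subgroup divided
    by the sum of weights of all eligibles.
    """
    weight_sums = defaultdict(float)
    total_weight = 0.0
    for category, cell, weight in respondents:
        weight_sums[(category, cell)] += weight
        total_weight += weight
    return {key: w / total_weight for key, w in weight_sums.items()}


# Hypothetical respondents: (sex, smoothing cell, sampling weight).
sample = [
    ("male", 1, 120.0),
    ("male", 2, 80.0),
    ("female", 1, 100.0),
    ("female", 2, 100.0),
]
for key, proportion in sorted(subgroup_proportions(sample).items()):
    print(key, round(proportion, 3))
```

With sex and nine smoothing cells, the same function would return the 18 subgroup proportions described in the text.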

This produced, for each subgroup, estimated proportions for each of the
eight assessment years. Each such set of proportions was then smoothed to
give a sequence of adjusted population proportions that tended to preserve
actual time trends in proportions while reducing the sampling variability of
these estimates over time. The adjusted proportions were constrained by
requiring that the sum of adjusted proportions across all subgroups for each
year and reporting variable (formed by the categories of the variable and the
smoothing cells) total one. For example, the sum of adjusted proportions for
male and female 13-year-olds in 1972 had to equal one.
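
The following sketch shows one way such a smoothing step could look. It is not the procedure actually used: the report cites only Chapter 7 of Tukey's book, so the running median of three used here is an assumed stand-in for the robust/resistant smoother, and rescaling each year to sum to one is an assumed way of imposing the constraint. All data values are hypothetical.

```python
from statistics import median


def running_median_3(series):
    """Smooth a sequence with a running median of three.

    End values are copied unchanged (a simplifying assumption).
    """
    if len(series) < 3:
        return list(series)
    smoothed = [series[0]]
    for i in range(1, len(series) - 1):
        smoothed.append(median(series[i - 1:i + 2]))
    smoothed.append(series[-1])
    return smoothed


def smooth_and_normalize(props_by_subgroup):
    """Smooth each subgroup's yearly proportions, then rescale each year
    so that the adjusted proportions across all subgroups total one."""
    smoothed = {k: running_median_3(v) for k, v in props_by_subgroup.items()}
    n_years = len(next(iter(smoothed.values())))
    for year in range(n_years):
        total = sum(series[year] for series in smoothed.values())
        for series in smoothed.values():
            series[year] /= total
    return smoothed


# Hypothetical yearly proportions with a sampling "spike" in year 2.
raw = {
    "male": [0.50, 0.58, 0.50, 0.50],
    "female": [0.50, 0.42, 0.50, 0.50],
}
adjusted = smooth_and_normalize(raw)
print(adjusted)
```

A median-based smoother is resistant to a single aberrant year, which is the property the text attributes to the procedure; a genuine time trend lasting several years would pass through largely unchanged.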

The sum of the adjusted proportions across the smoothing cells for a given
year and reporting category provides an estimate of the proportion of eligibles
in the population who were members of the reporting category. These sums are
the proportions reported in Table B-2.

Once smoothed estimates of population proportions were obtained, respondent
weights were adjusted so that adjusted performance estimates could be
computed. As explained in Appendix A, the percentage of correct responses is
estimated by dividing the sum of weights for students responding correctly to
an exercise by the sum of weights for all students exposed to the exercise.

Exercise-level performance estimates are affected by both year-to-year
sampling variability and within-year variability, because each exercise appears
in only one package and is administered to a relatively small fraction of all






TABLE B-1. Definition of Smoothing Cells for Adjusting Population Proportions

to, . . ■ White White White White White White Blaci Black • Other 

' : lace 



Gradet ■ 




Paratal ekationtt 1, , im m m nTO k' 
Number/ of' home. .■ 

■ reference ifei ,, 4 " ' <|, ^ .a 



TABLE B-2. Estimated Population Proportions of
National Assessment Reporting Groups for Ages
9, 13 and 17 in 1969-70, 1972-73 and 1976-77

                                   1969-70              1972-73              1976-77
Reporting Groups              Age 9 Age 13 Age 17  Age 9 Age 13 Age 17  Age 9 Age 13 Age 17

Sex
  Male                         .495   .498   .489   .499   .500   .492   .502   .497   .490
  Female                       .505   .502   .511   .501   .500   .508   .498   .503   .510

Race
  White                        .843   .851   .876   .808   .824   .853   .812   .808   .836
  Black                        .133   .132   .109   .141   .128   .112   .128   .135   .116
  Other                        .024   .017   .015   .051   .048   .035   .060   .057   .048

Region
  Northeast                    .251   .245   .244   .260   .249   .268   .252   .2?7   .249
  Southeast                    .213   .223   .196   .224   .225   .198   .225   .2??   .199
  Central                      .295   .291   .303   .275   .284   .292   .273   .273   .308
  West                         .241   .241   .258   .241   .242   .242   .250   .246   .243

Parental education
  Not graduated high school    .103   .154   .210   .095   .14?   .173   .090   .134   .151
  Graduated high school        .231   .314   .326   .220   .307   .317   .246   .328   .333
  Post high school             .341   .412   .416   .325   .406   .463   .323   .408   .469
  Unknown                      .325   .121   .047   .360   .138   .047   .340   .129   .047

Type of community
  Extreme rural                .086   .096   .088   .085   .095   .081   .092   .103   .081
  Low metro                    .066   .088   .095   .077   .077   .096   .072   .071   .085
  High metro                   .124   .118   .140   .126   .118   .121   .102   .110   .102
  Other                        .724   .698   .677   .712   .710   .702   .734   .716   .732

Size of community
  Big city                     .219   .218   .223   .209   .193   .183   .179   .173   .169
  Fringes around big cities    .217   .207   .235   .224   .232   .252   .201   .185   .230
  Medium city                  .135   .144   .142   .139   .142   .143   .146   .132   .146
  Smaller places               .428   .431   .399   .428   .433   .422   .474   .510   .455

Grade in school
  <3, <7, <10                  .013   .033   .0??   .010   .027   .017   .006   .021   .015
  3, 7, 10                     .232   .239   .125   .230   .246   .127   .232   .251   .136
  4, 8, 11                     .731   .715   .724   .747   .717   .728   .751   .720   .749
  >4, >8, 12                   .008   .011   .133   .006   .010   .127   .006   .008   .100
  Other                        .016   .002   .000   .007   .001   .001   .004   .000   .000
respondents. For example, in 1976 ten packages of exercises were administered
to 13-year-olds. Smoothed population proportion estimates were based on
25,653 13-year-olds, but each exercise-specific performance estimate is based
on the approximately 2,565 13-year-olds who took a particular package.
Consequently, respondent weights were adjusted to dampen both between-year
variability and package-to-package variability within an assessment year.

Respondent weights were adjusted separately for every reporting category
by assessment year, age group and package combination. To simplify the
explanation, the adjustment process is described for male 13-year-olds who
were administered package 1 (of 10) in 1976. The same process applies to all
other combinations of reporting categories, ages, packages and assessments.

Weight sums were computed for the male 13-year-olds (who took package 1 in
1976) falling into each of the nine smoothing cells and converted to
proportions by dividing by the sum of weights in all nine smoothing cells. An
adjustment factor was then computed for each smoothing cell by dividing the
smoothed proportion for that cell by the package proportion for the cell, as
shown in Table B-3. The weight for each respondent (male 13-year-olds who took
package 1 in 1976) in a smoothing cell was multiplied by the adjustment factor
for the cell. Adjusted performance estimates were then computed with the
adjusted weights.
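
The adjustment can be sketched in a few lines. The example below uses two smoothing cells instead of nine, and every number, name and data layout in it is hypothetical; it illustrates the arithmetic described above rather than the original processing programs.

```python
def adjustment_factors(smoothed_props, package_weight_sums):
    """Adjustment factor per smoothing cell.

    factor = smoothed proportion for the cell
             / (package weight sum for the cell / total package weight)
    """
    total = sum(package_weight_sums.values())
    return {
        cell: smoothed_props[cell] / (weight_sum / total)
        for cell, weight_sum in package_weight_sums.items()
    }


def adjusted_percent_correct(respondents, factors):
    """Percentage correct using adjusted weights.

    `respondents` is a list of (cell, weight, answered_correctly) tuples.
    """
    numerator = sum(w * factors[c] for c, w, ok in respondents if ok)
    denominator = sum(w * factors[c] for c, w, ok in respondents)
    return 100.0 * numerator / denominator


# Hypothetical group that took one package, split into two cells.
smoothed = {1: 0.50, 2: 0.50}          # smoothed proportions
package_sums = {1: 60.0, 2: 40.0}      # unadjusted weight sums
factors = adjustment_factors(smoothed, package_sums)

respondents = [(1, 60.0, True), (2, 40.0, False)]
print(factors)
print(adjusted_percent_correct(respondents, factors))
```

In this toy case cell 1 is overrepresented in the package (60% against a smoothed 50%), so its weights are scaled down and the estimate moves from 60% to 50% correct.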

The result of the smoothing and weight-adjustment process is that the 
estimated reporting-group proportions are identical for all packages (and exer- 
cises) in a particular age group and assessment year combination. More impor- 
tantly, both adjusted performance estimates and changes in those estimates 
appear to be somewhat less susceptible to sampling variability, both across and 
within years. At the present time weighting class and other adjustment pro- 
cedures continue to be evaluated to determine whether the increased precision 
in performance estimates is large enough to warrant the considerable addition- 
al costs involved. 
TABLE B-3. Proportions and Weight Adjustment ...

1 in 1976



Adjusted, smoothed proportions
for male 13-year-olds in
1976

Unadjusted proportions for
male 13-year-olds in 1976



Smoothing Cells
m m Cell MJTOeiil Cell 7 Cell 8 Cell 9 



■;156 .208 ■ ,073 ,158. ,139. .047 M .054 



Weight adjustment factor
(adjusted/unadjusted)



•154 .204 ,072 : . 143 ,136' , ,054 , ,096 .052 



1.022 l.Ori.018 1,019 1,104 1.022 ■,860, ,777 1,029 
APPENDIX C

CHANGES IN PROCEDURES BETWEEN ASSESSMENTS

As with any sample survey, National Assessment results are subject to both
sampling and nonsampling error. Sampling errors occur because responses are
obtained only from a sample, not from the entire population. Nonsampling
errors are unwanted variations in responses that might come from many sources
in an assessment: the arrangement of exercises in packages, variability among
exercise administrators, differing motivation levels of respondents, errors
in recording responses and errors in data processing procedures, among others.
When assessing change, we hold constant as many conditions as possible so
that the nonsampling errors in the first assessment will cancel out those in
the second assessment when the difference in achievement is computed.

However, it is not possible to control all sources of nonsampling error.
Some conditions did change over the course of the three science assessments.
This appendix describes changes in definitions of reporting variables, data
collection procedures, "I don't know" responses and nonresponse to exercises.
Comparative data on released versus unreleased exercises are also included.

Definition of Variables

Parental Education

The wording of the questions asking for level of parents' education was
changed slightly after the first assessment. In 1969-70 respondents were
asked, "How far did your father, or the man living in your home who acts as
your father, go in school?" A similar question was asked about the
respondent's mother. In subsequent assessments, the wording was simplified to
"How much school did your father complete?" with a similar question about the
mother's schooling. Only the form of the question was changed; the response
categories were not. After the 1972-73 assessment, results for changes in
achievement by parental-education categories were not reported in the main body
of change reports. However, because the results across the three assessments
have been highly consistent despite the change in wording, they have been
included in this report.

The proportion of respondents who did not report an education level for
either parent has been high for 9-year-olds (about one-third) and lower for
13- and 17-year-olds (about 10% and 5%, respectively). Achievement of
respondents in the unknown parental-education category is always lower than for
any other category. Whether the low achievement of this group reflects lower
ability, lower parental interest or influence, motivational problems in the
assessment situation or some other factor is not known.



Race

In 1969-70, exercise administrators visually identified respondents as
white, black or other. In 1972-73 and 1976-77, partially in response to
persons and organizations wanting information about other racial or ethnic
groups, the exercise administrators were asked to classify respondents into
one of five categories (white, black, Puerto Rican, Mexican-American or other)
using visual identification and surname. When there was a question, the
administrator was advised to determine the language or dialect the student
spoke. In all cases, Puerto Rican and Mexican-American identification took
priority over other categories.

The degree to which categorization into Puerto Rican and Mexican-American
groups, which took precedence over racial identification, affected the racial
categories themselves is not directly known. The proportions of whites and
blacks have been within the range of sampling variability,¹ and group
differences in achievement were quite consistent over the assessments. There
may still be a small effect due to the change in definitions.



Community Size

In all three assessments community-size definitions were based on 1970
census data. Community characteristics have changed to some extent since
1970. Because of annexations, migration, births, etc., some smaller places
have become more like medium cities or fringes of big cities, while some
medium cities have become more like big cities or fringes of big cities. Data
from Current Population Survey reports² indicate that the changes, while real,
probably have not been large enough to seriously affect results for National
Assessment categories. The 1980 census will provide more detailed data on
community characteristics and migration between various geographic
subpopulations. Analysis of population trends and their relationship to
performance trends will be a major part of National Assessment's analysis and
research effort in future assessments.



¹See Appendix B for estimated population proportions.

²U.S. Bureau of the Census, "Mobility of the Population of the United States:
March 1970 to March 1973," Current Population Reports, Series P-20, No. 262
(Washington, D.C.: U.S. Government Printing Office, 1974); U.S. Bureau of the
Census, "Geographic Mobility: March 1975 to March 1977," Current Population
Reports, Series P-20, No. 320 (Washington, D.C.: U.S. Government Printing
Office, 1978).






Type of Community

In each assessment, principals in sample schools were asked to estimate the
proportion of adults in each of the following categories for the school
attendance area:

A. Professional and managerial

B. Sales, clerical, technical and skilled

C. Factory and other blue collar

D. Farm workers

E. Not regularly employed

F. Welfare

Missing data were estimated from 1970 census reports. 

Using these categories, rural, low-metro and high-metro indexes were then
constructed for each school:

Rural:      D-(C+2A)

Low metro:  E+F-A
High metro: A-(C+D+E+F)

At each age, schools were excluded from the extreme-rural category if
they were not in the smaller-places community-size category or if the principal
reported that any students came from places of greater than 10,000 population.
Remaining schools in this category that contained the 10% of the total sample
highest on the extreme-rural index were classified as extreme rural. Only
schools in big-city or fringes-around-big-cities categories were eligible for
the high- and low-metro classification. Eligible schools containing the 10%
of the sample highest on the high- or low-metro indexes were classified as high
or low metro, respectively.
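
The index construction and the 10% cutoff can be illustrated as follows. This is a sketch under stated assumptions: the formulas are taken as they read in this copy of the report (the high-metro expression is partly illegible and is assumed here to be A-(C+D+E+F)), the rule of adding schools in index order until 10% of the total sample is covered is one reading of the text, and the school records are hypothetical.

```python
def community_indexes(pct):
    """Compute the three indexes from a principal's estimates.

    `pct` maps occupational categories "A".."F" to percentages of adults
    in the school attendance area.
    """
    a, c, d, e, f = pct["A"], pct["C"], pct["D"], pct["E"], pct["F"]
    return {
        "rural": d - (c + 2 * a),
        "low_metro": e + f - a,
        "high_metro": a - (c + d + e + f),  # assumed reading of the formula
    }


def classify_extreme(eligible_schools, index_name, total_sample, share=0.10):
    """Pick eligible schools, highest index first, until they contain
    `share` of the total sample of students (assumed cutoff rule)."""
    ranked = sorted(
        eligible_schools,
        key=lambda school: school["indexes"][index_name],
        reverse=True,
    )
    target = share * total_sample
    chosen, covered = [], 0
    for school in ranked:
        if covered >= target:
            break
        chosen.append(school["id"])
        covered += school["n_students"]
    return chosen


# Hypothetical principal estimates for one school.
example = community_indexes(
    {"A": 10, "B": 20, "C": 30, "D": 25, "E": 10, "F": 5}
)
print(example)

# Hypothetical eligible schools in a total sample of 1,000 students.
schools = [
    {"id": "s1", "n_students": 60, "indexes": {"rural": 40}},
    {"id": "s2", "n_students": 60, "indexes": {"rural": 10}},
    {"id": "s3", "n_students": 60, "indexes": {"rural": -5}},
]
print(classify_extreme(schools, "rural", total_sample=1000))
```

Because the cutoff is a fixed share of each year's sample, the schools selected depend on the sample drawn that year, which is the source of the year-to-year differences in the populations represented.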

The extreme-type-of-community definitions have proved useful in identifying
a constant percentage of respondents that are likely to be from opposite
extremes on a rural-urban continuum and, within urban schools, at opposite
extremes of a socioeconomic continuum. The populations represented each year
are slightly different. The categories each year represent the most extreme
10% of students in that year's sample. If a particular year's sample happens
to be less rural than previously defined, for example, then extreme rural will
cover a less rural population that year. Also, the sample design used in each
of the science assessments has defined and oversampled rural and
low-socioeconomic areas somewhat differently. To the extent that National
Assessment is more successful in oversampling these areas, 10% of the sample
covers a smaller proportion of the extreme-rural and low-metro populations
(and, conversely, a larger proportion of the high-metro population).


One other caution should be observed in interpreting extreme-type-of-
community data. For the older age groups, mean achievement in these categories
is generally closer to that of the nation than it is for the younger age group.
This phenomenon might be partly due to the larger size and heterogeneity of
secondary-school attendance areas when compared to those of elementary schools.



Data Collection Procedures



Data Collection Staff



The first assessment of science occurred during the first year that
National Assessment collected data. The second and third assessments of
science took place in the fourth and eighth years of data collection,
respectively, by which time several improvements in field operations had been
made. For example, the field staff used in later assessments had more
experience and better training than did the staff in the initial assessment
year. Better quality-control procedures were also implemented so that the
field staff could be contacted quickly and instructed about procedural changes
if there were difficulties in administration.



Learning-Area Mix

In 1969-70, science exercises were administered in packages also containing
writing and citizenship exercises; in 1972-73, science was administered
with mathematics. Most of the citizenship and writing exercises were
short-answer or essay exercises, while most of the mathematics exercises
required respondents to compute and record their own answers. Most of the
mathematics exercises were short; the citizenship and writing exercises were
longer. Although the total testing time (about 40 minutes per respondent) was
the same in each assessment, responding to many short exercises rather than a
few long ones may have had an effect on performance. The 1976-77 packages
contained only science exercises, most of which were multiple-choice. For the
first time in a science assessment, most packages contained experience or
attitude inventories.



Taping

There were some slight variations in the taping of the exercises. New
tapes were made in each assessment because different combinations of science
exercises and learning areas were assessed each time. A different announcer
was used in the first assessment than in the last two, but in each assessment
the announcer read clearly and at a constant rate. Tape scripts for change
exercises were kept as constant as possible (including errors), but there were
slight changes in the introductory remarks, transitional remarks between
exercises, and instructions on the use of the "I don't know" response.

All 1972-73 taping conventions were replicated as closely as possible on
exercises for the 1976-77 assessment. In the second assessment, the announcer
said, at the end of each exercise, "If you do not know the answer, please fill
in the oval beside 'I don't know.'" Slightly different conventions were used
for new exercises in 1976-77. "I don't know" was read immediately after the
other response choices at age 9; it was not read at all at ages 13 and 17. To
minimize the effect of these changes, old exercises were clustered at either
the beginning or end of each package. They were not segregated in separate
packages because of the increased precision of summaries when exercises are
spread over multiple packages.
Printing

Instructions on the bottom of each page telling respondents to stop or to
continue on to the next page were given added visual emphasis in the second
assessment: "Stop" appeared in an octagon and "Please Continue on the Next
Page" appeared in an arrow. In addition, there were slight changes in the
sizes of type faces used in the two assessments. Both sets of printing have
been judged by National Assessment's reading consultants to be easily readable
at the appropriate age levels. Printing was essentially identical in the
second and third assessments.



Mode of Administration

Most of the assessments conducted by National Assessment have contained
both individual (one-to-one interviews) and group administrations. All
exercises used to measure changes in achievement between the first two
assessments were group administered. In 1976-77 all exercise packages were
group administered. However, two exercises used to measure change between 1973
and 1977 were administered individually to 9-year-olds in 1973. Exercise 202029
asked whether water would weigh more, the same or less when frozen. Respondents
were then asked to explain their choice of more, the same, or less. Only the
multiple-choice portion was used in change summaries. Exercise 202072 was a
multiple-choice exercise that required students to pick the picture of a can
that might contain botulism poison. Changes in the percentage of correct
responses to both exercises were negative (-8.5% and -3.1%, respectively),
while the average change for all exercises was essentially zero. However,
neither change figure appeared to be unreasonably large when compared with
changes for the other exercises.

In the first two assessments, group administrations were limited to 12
students. In 1976-77, the planned average group size was set at 16, with a
range of 10 to 35 students. Some problems with overcrowding were encountered
in the larger sessions.



Position in Package

In both the second and third science assessments, science exercises were
reassigned to assessment packages. In 1972-73, new and old science exercises
were mixed with mathematics exercises. In packaging exercises, National
Assessment staff attempted to balance difficulty level, objective, content type
and other variables across packages, within the constraints that total
assessment time was fixed for each package of exercises and that no exercise in
a package could provide the answer to any other exercise. In preparation for
the 1976-77 assessment, nearly all change exercises from a 1972-73 package were
put together in either the beginning or end of a 1976-77 package. There were
some exceptions due to differing numbers of packages between assessments and
other constraints.

Having all change exercises at the beginning or end of a package
represented a major departure from prior assessment practice. If there were
biases associated with package location, the validity of change measures would
be jeopardized. It has been suggested that examinees might do poorly on the
first exercise (or exercises) in a testing situation because of the initial
tension examinees sometimes experience.³ In addition, performance on the last
exercise (or exercises) might be lower than expected if some examinees do not
have time to complete them.

National Assessment attempts to control the effects of exercise position
in a package by presenting an audio explanation of what the assessment is, how
results will be used, and one or more example exercises before actual
assessment begins. Further, exercises are presented via audiotapes to pace
respondents through to the end of the packages.

After the second science assessment, National Assessment staff analyzed
results for exercises that were first or last in a package in either
assessment. There was little relationship between position and changes in
achievement.⁴ A small controlled experiment on position and format was
included in the 1973-74 assessment of writing and career and occupational
development. Even with approximately 2,500 respondents per treatment
condition, no systematic position effects were detected.⁵

Position-in-package effects between the second and third assessments were
investigated by dividing packages into thirds and classifying exercises by
location in the second and third assessments. Exercise administrations in the
last part of a package are sometimes lost when sessions start late, schools
close early, etc., so means for both correct response and nonresponse were
computed by package location. Numbers of exercises, mean changes in percentages
of correct responses, and standard deviations⁶ are listed in Table C-1 and
plotted in Figure C-1. The statistics for nonresponse⁷ are listed in Table C-2.
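
The cross-classification behind Tables C-1 and C-2 can be sketched as below. The rule for splitting a package into thirds, the use of the sample standard deviation, the field names and the exercise records are all assumptions made for illustration; the report does not state how these details were handled.

```python
from collections import defaultdict
from statistics import mean, stdev


def package_third(position, package_length):
    """Return 1, 2 or 3 for the first, middle or last third of a package.

    `position` is the 1-based position of the exercise in the package.
    """
    return (3 * (position - 1)) // package_length + 1


def change_by_location(exercises):
    """Summarize change in percent correct by package location.

    Returns {(third_1977, third_1973): (mean change, S.D., count)}.
    The S.D. is None for cells with a single exercise.
    """
    cells = defaultdict(list)
    for ex in exercises:
        key = (
            package_third(ex["pos_1977"], ex["len_1977"]),
            package_third(ex["pos_1973"], ex["len_1973"]),
        )
        cells[key].append(ex["pct_1977"] - ex["pct_1973"])
    summary = {}
    for key, changes in cells.items():
        sd = stdev(changes) if len(changes) > 1 else None
        summary[key] = (mean(changes), sd, len(changes))
    return summary


# Hypothetical change exercises (positions and percentages invented).
exercises = [
    {"pos_1977": 1, "len_1977": 9, "pos_1973": 8, "len_1973": 9,
     "pct_1977": 52.0, "pct_1973": 50.0},
    {"pos_1977": 2, "len_1977": 9, "pos_1973": 9, "len_1973": 9,
     "pct_1977": 64.0, "pct_1973": 60.0},
]
print(change_by_location(exercises))
```

Each key pairs the 1976-77 location with the 1972-73 location, matching the row and column layout of the tables.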



³R.L. Ebel, Measuring Educational Achievement (Englewood Cliffs, N.J.:
Prentice-Hall, 1965).

⁴See Science Technical Report: Summary Volume, Report 04-S-21, pp. 100-104,
for detailed documentation of this position-in-package analysis (available
from National Assessment).

⁵N. Burton et al., "The Effect of Position and Format on the Difficulty of
Assessment Exercises," paper presented at the meeting of the American
Educational Research Association, San Francisco, April 1976.

⁶Standard deviations are included as indicators of the variation in
exercise-level changes in each cell. Because not all exercises were
administered to all students, they are not valid statistics for testing
differences between cells.

⁷The nonresponse reported in the appendix is nonresponse to exercises for
respondents who were present for package administrations. Failure to
participate in the assessment because of school or student refusal is treated
in Appendix D.
TABLE C-1. Mean Changes in Percentages of Correct Responses Between 1972-73 and
1976-77 by Package Location in 1972-73 and 1976-77, Ages 9, 13 and 17

     1976-77                                      1972-73 Package Location
     Package
Age  Location                            First 1/3   Middle 1/3     Last 1/3        Total

9    First 1/3   Mean change (S.D.)†     3.5 (3.0)     .1 (3.2)    1.8 (4.3)    1.8 (3.7)
                 Number of exercises            13           13           13           39

     Last 1/3    Mean change (S.D.)     -2.2 (5.2)     .4 (4.9)   -3.6 (3.5)   -1.9 (4.7)
                 Number of exercises             9           11           12           32

     Total       Mean change (S.D.)      1.1 (4.9)     .2 (4.0)    -.8 (4.7)     .1 (4.5)
                 Number of exercises            22           24           25           71

13   First 1/3   Mean change (S.D.)      -.1 (3.3)    -.2 (4.0)    1.5 (2.7)     .3 (3.4)
                 Number of exercises            11           14           11           36

     Last 1/3    Mean change (S.D.)     -2.4 (5.4)   -1.7 (?.?)   -1.5 (6.5)   -1.8 (5.8)
                 Number of exercises            12           10           17           39

     Total       Mean change (S.D.)     -1.3 (4.6)    -.9 (4.6)    -.3 (5.5)    -.8 (4.9)
                 Number of exercises            23           24           28           75

17   First 1/3   Mean change (S.D.)     -1.3 (3.0)   -1.9 (3.6)   -1.1 (3.5)   -1.4 (3.3)
                 Number of exercises            15           15           13           43

     Last 1/3    Mean change (S.D.)     -2.2 (3.2)    -.9 (3.5)   -3.0 (3.0)   -2.2 (3.2)
                 Number of exercises             9            7           11           27

     Total       Mean change (S.D.)     -1.7 (3.0)   -1.6 (3.6)   -1.9 (3.4)   -1.7 (3.3)
                 Number of exercises            24           22           24           70



FIGURE C-1. Mean Change in Percentage of Correct Response From 1972-73 
to 1975-77 by Position in Package, Ages 9, 13 and 17 • , 
TABLE C-2. Mean Changes in Percentages of Nonresponses Between 1972-73 and
1976-77 by Package Location in 1972-73 and 1976-77, Ages 9, 13 and 17

      1976-77
      Package                                       1972-73 Package Location
Age   Location                           First 1/3    Middle 1/3    Last 1/3      Total

 9    First 1/3   Mean change (S.D.)†     .7 (3.3)    -.1 ( .2)     -.3 ( .8)     .0 (2.0)
                  Number of exercises       12            13           13           38

      Last 1/3    Mean change (S.D.)     -.3 ( .6)     .3 (1.4)      .8 ( .3)     .3 (1.0)
                  Number of exercises        9            10           12           31

      Total       Mean change (S.D.)      .3 (2.6)     .1 (1.0)      .2 ( .8)     .1 (1.6)
                  Number of exercises       21            23           25           69

13    First 1/3   Mean change (S.D.)     -.2 ( .8)     .3 ( .4)    -1.8 (1.3)    -.8 (1.1)
                  Number of exercises       11            13           11           35

      Last 1/3    Mean change (S.D.)      .3 ( .7)     .8 (2.1)       ...         .2 (1.9)
                  Number of exercises       11            10          ...           38

      Total       Mean change (S.D.)      .0 ( .8)     .2 (1.5)     -.8 (2.2)     .2 (1.7)
                  Number of exercises       22            23           28           73

17    First 1/3   Mean change (S.D.)      .4 ( .5)    -.9 (3.5)     -.6 ( .4)    -.3 (2.0)
                  Number of exercises       15            13           13           41

      Last 1/3    Mean change (S.D.)     1.2 ( .5)    1.6 ( .7)     1.9 ( .7)    1.6 ( .7)
                  Number of exercises        9           ...           11           27

      Total       Mean change (S.D.)      .7 (...)    1.0 (3.0)      .5 (1.4)     .4 (1.9)
                  Number of exercises       24            20           28           68

†S.D. = Standard deviation.


FIGURE C-2. Mean Change in Percentage of Nonresponse From 1972-73
to 1976-77 by Position in Package, Ages 9, 13 and 17
It was expected that exercises appearing either at the beginning
or at the end of a package in both assessments would provide the best control
of nonsampling error. Results were not always consistent with this expectation.



At age 9, exercises appearing in the first third of a package in both
assessments had a positive change of 3½ percentage points, while those
appearing in the last third of a package in both assessments had a slightly larger
negative change. At age 13, mean changes for exercises appearing in the first
third of 1976-77 packages were less negative than for exercises appearing in
the last third of 1976-77 packages. At age 17, differences between means
were small and inconsistent.
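
As a check on how the cells of Table C-1 relate to its marginal totals, the short Python sketch below pools the three age 9 cell means for exercises in the first third of 1976-77 packages. It assumes, although the table does not say so explicitly, that each Total entry is the mean over all exercises in the row, so that it equals the cell means weighted by their exercise counts. The function name pooled_mean is illustrative only.

```python
# Cell means and exercise counts from Table C-1, age 9,
# exercises in the first third of 1976-77 packages.
# Each tuple is (mean change in percentage correct, number of exercises).
cells = [(3.5, 13), (0.1, 13), (1.8, 13)]


def pooled_mean(cells):
    """Exercise-weighted mean of the cell means (assumed definition of Total)."""
    total_n = sum(n for _, n in cells)
    return sum(mean * n for mean, n in cells) / total_n


row_total = pooled_mean(cells)
print(round(row_total, 1))
```

With 13 exercises in each cell, the pooled value rounds to 1.8, the figure shown in the Total column for that row.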

The relationship between changes in percentages of correct responses and
position in package was insufficiently clear to merit further action. The
numbers of exercises in the various positions were well balanced across the second
and third assessments, but exercises were not randomly assigned to location in
either assessment, and the differences observed might have been confounded by
content or some other unknown variable.

Changes in nonresponse were slightly but consistently higher for exercises
appearing in the last third of 1976-77 packages. The only exception occurred
at age 9, where a large increase in nonresponse to one open-ended exercise
caused a reversal in mean changes for exercises in the first third of 1972-73
packages. Deleting that one exercise makes the two means identical. After inspecting
mean changes in nonresponse (Table C-2 and Figure C-2) and exercise-by-exercise
plots of changes in position in package, nonresponse was dropped from further
consideration. The rates of nonresponse were too small and too unrelated to
package location to merit any adjustment of correct response change statistics.

"I Don't Know" Responses and Nonresponse 

National Assessment emphasizes to respondents that it is not a test in the
usual sense and scores are not reported for individuals. Exercises are presented
on audiotapes to help ensure exposure to all exercises, and the response
choice "I don't know" is included among the possible choices on all cognitive
multiple-choice exercises to minimize guessing.^

Table C-3 shows the mean percentages of "I don't know" responses in
1969-70 and 1972-73 for exercises used to measure changes in achievement
between the first two assessments, and similar data for exercises used to measure
changes in achievement between the second and third assessments.



^N. Burton et al., "The Effect of Position and Format on the Difficulty of
Assessment Exercises."
TABLE C-3. Mean Percentage Responding "I Don't Know" in 1969-70, 1972-73 and
1976-77 for Exercises Used to Measure Change From 1969-70 to 1972-73 and
From 1972-73 to 1976-77, Ages 9, 13 and 17

                           Age 9 Percentages                   Age 13 Percentages                  Age 17 Percentages
                     92 Exercises      66 Exercises      67 Exercises      69 Exercises      63 Exercises      64 Exercises
                    1969-70  1972-73  1972-73  1976-77  1969-70  1972-73  1972-73  1976-77  1969-70  1972-73  1972-73  1976-77

Nation                  6.2      6.0      7.6     10.3      ...      8.1      7.4      8.7     12.9     14.0      9.9     11.4

Region
  Northeast             6.0      5.4      7.1      ...      6.5      7.8      6.9      8.0     12.1     13.5      9.7     10.3
  Southeast             6.2      6.8      8.1     11.9      6.1      7.3      7.2      8.0     12.0     13.3      9.6     11.8
  Central               6.1      5.6      6.9     10.6      6.9      8.5      8.1      9.1     13.7     14.3      9.9     11.6
  West                  6.5      6.1      7.9      9.9      8.1      8.5      ...      9.8     13.2     14.6     10.4     12.1

Sex
  Male                  5.2      5.1      6.5      9.2      6.1      7.1      6.5      7.6     11.3     12.2      8.6      9.9
  Female                7.2      6.8      8.7     11.5      7.7      9.0      ...      ...     14.4     15.6     11.3     13.1

Race
  White                 5.9      5.6      7.1      ...      6.6      7.6      7.1      ...     12.9     13.6      9.5     11.1
  Black                 7.4      7.1      8.9     11.8      7.9      9.9      9.3      ...     12.3     15.3     11.1     13.6

Type of community
  Extreme rural         6.7      6.5      7.7     10.6      8.3      7.5      7.2      9.7     12.6     14.4     10.0     12.4
  Low metro             7.7      7.6      9.6     14.0      8.8     10.4      9.4      9.0     12.8     15.5     11.1     12.1
  High metro            4.4      4.7      6.4      7.5      6.6      8.0      7.2      8.1     12.5     13.2      9.2     10.1
There was a slight but fairly consistent increase in usage across the
three assessments. There was an increase in "I don't know" responses with age
on exercises used in both 1969-70 and 1972-73; that trend is not apparent on
exercises used in both 1972-73 and 1976-77. Reporting-group usage of the "I
don't know" response mirrored achievement trends fairly closely. Sex, race
and community-type differences were all the opposite of achievement differences,
while the pattern for regional groups is not clear.

Table C-4 shows the mean percentages of nonresponse in 1969-70 and in
1972-73 on exercises used to measure changes in achievement between the first
two assessments, and similar data for exercises used to measure changes in
achievement from 1972-73 to 1976-77. The mean percentage of exercise
nonresponse ranged from approximately ½% to 1% across all ages and assessments.
Nonresponse for various reporting groups tends to mirror achievement patterns.
For example, blacks and low-metro students have somewhat higher nonresponse
rates than whites and high-metro students, just the opposite of the achievement
results. The trend is less clear for regional and sex groups, where achievement
differences were smaller than for race and type of community.



Released and Reassessed Exercises 

Most National Assessment change measures are based on exercises that have
never been released for public use. The 1969-70 to 1972-73 change summaries
contained 10, 6 and 4 previously released exercises at ages 9, 13 and 17,
respectively. Analyses of changes in achievement on released versus unreleased
exercises were inconclusive.^ At ages 9 and 17, achievement on released
exercises declined at the same rate as that on unreleased exercises, while at
age 13, the changes in achievement were generally positive for both types of
exercises. However, one of the exercises showing a large, positive change at age
13 exhibited a large, negative change at age 17.

A number of previously released exercises were included in the 1976-77
science assessment; however, none has been included in change summaries. Almost
all of those previously released exercises were released after the 1969-70
assessment. Change results for those exercises and unreleased exercises from
the 1969-70 assessment are shown in Tables C-5 to C-7.^10 Because differential
changes have been observed in biology and physical science exercises, results
are given by type of science as well as for all exercises.

For all exercises, changes in the percentages of correct responses between
1969-70 and 1976-77 were quite similar for released and unreleased exercises.



^See Science Technical Report: Summary Volume, Report 04-S-21, pp. 105-108,
for additional details.

^10 All results were computed prior to weight smoothing; change statistics for
unreleased exercises differ slightly from those reported in Chapter 2.
TABLE C-4. Mean Percentage of Nonresponse in 1969-70, 1972-73 and 1976-77 for
Exercises Used to Measure Change From 1969-70 to 1972-73 and
From 1972-73 to 1976-77, Ages 9, 13 and 17

■ ; . MPercentajes Age^ 13 Percentages /!ge 17 Percentage^ 

»5 , 67 ,68 "64 62 

txercises- Exercises Exercises Exercises Exercises Exercises 

10 1973 ,1973 1977 WW Wm Wm WW 

■ '4' 'S' .6 .8 ■ .2 ,,8 1.1 ,9 .3 U .8 
fiegion . , ■ ' 

'3 .6 -.5 .7 ■ .3 -.3. 4 6'- 2 8 1! Q 

Sex 



Nation 



Race 
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.5 L3 •• 
.3 2,5 , 1.3 1.3 



-5 .6: .8 .9' .2 9 12 9 ? n • o n 



fx ■ '2 -5 .1 .7 1.0 .6 .2 9 7 9 

m 1.6 1,1 1,6 2.2 ..4 ,9 1.5 2.4 j 

Fype cf conunity ' ' 

■ Extreme rural ' ,i .4 .5. .6 ,3 1.4 17 3 . ,, ' ■ 

iK^^^ H.H -•^2.0' 3:0 4> ;4 2;1 u d 



•! ,2 L6 



1.2 .5 



TABLE C-5. Mean Changes in Percentages of Correct Responses From 1970 to 1977 for
Released and Unreleased Exercises by Content Classification, Age 9

                                         Number         Change          Change
Classification                        of Exercises   1970 to 1973    1973 to 1977    Total Change

Biology
  Released after 1970 assessment           5             -.3            -1.4            -1.7
    Standard error                                      (1.0)           ( .9)           (1.0)
  Unreleased                              11            -1.1              .8             -.3
    Standard error                                      ( .7)           (...)           (...)

Physical science
  Released after 1970 assessment           8            -2.3             -.1            -2.4
    Standard error                                      (1.0)           (1.1)           (1.0)
  Unreleased                              16            -2.4            -2.3            -4.7
    Standard error                                      ( .7)           (...)           (...)

Unclassified
  Released after 1970 assessment           0             --              --              --
    Standard error                                       --              --              --
  Unreleased                               3             1.6             3.1             4.8
    Standard error                                      (1.0)           (1.1)           (1.1)

Total
  Released after 1970 assessment          13            -1.5             -.6            -2.2
    Standard error                                      ( .8)           ( .9)           ( .9)
  Unreleased                              30            -1.5             -.6            -2.1
    Standard error                                      ( .6)           ( .6)           ( .6)
TABLE C-6. Mean Changes in Percentages of Correct Responses From 1969 to 1976 for
Released and Unreleased Exercises by Content Classification, Age 13

                                         Number
Classification                        of Exercises      Change          Change       Total Change

Biology
  Released after 1969 assessment          ...           -2.0            -1.8            -3.8
    Standard error                                      (1.3)           (1.4)           (1.4)
  Unreleased                               7             -.7             2.2             1.5
    Standard error                                      ( .8)           ( .8)           ( .9)

Physical science
  Released after 1969 assessment           9            -1.6            -2.1            -3.7
    Standard error                                      ( .9)           ( .9)           ( .9)
  Unreleased                              13            -3.3            -2.1            -5.4
    Standard error                                      ( .8)           ( .8)           ( .7)

Unclassified
  Released after 1969 assessment           1            -4.1            -3.6            -7.7
    Standard error                                      (2.4)           (2.4)           (2.3)
  Unreleased                               3              .4            -9.5            -9.0
    Standard error                                      (1.2)           (1.4)           (1.4)

Total
  Released after 1969 assessment          14            -1.9            -2.1            -4.0
    Standard error                                      ( .9)           (1.0)           ( .9)
  Unreleased                              23            -2.0            -1.8            -3.8
    Standard error                                      ( .6)           (...)           ( .7)


TABLE C-7. Mean Changes in Percentages of Correct Responses From 1969 to 1977 for
Released and Unreleased Exercises by Content Classification, Age 17

                                         Number         Change          Change
Classification                        of Exercises   1969 to 1973    1973 to 1977    Total Change

Biology
  Released after 1969 assessment           1            -3.8            -8.2           -12.0
    Standard error                                      (...)           (2.0)           (2.4)
  Unreleased                               8            -2.6             -.6            -3.3
    Standard error                                      ( .7)           ( .7)           (...)

Physical science
  Released after 1969 assessment           5            -3.0            -3.3            -6.3
    Standard error                                      (1.0)           (1.0)           (1.1)
  Unreleased                              13            -2.7            -3.6            -6.3
    Standard error                                      ( .8)           ( .8)           ( .8)

Unclassified
  Released after 1969 assessment           2            -5.4            -1.7            -7.1
    Standard error                                      (1.6)           (1.5)           (1.6)
  Unreleased                               2            -4.2             ...            -4.8
    Standard error                                      (1.2)           (1.0)           (1.2)

Total
  Released after 1969 assessment          ...           -3.7            -3.5            -7.2
    Standard error                                      ( .9)           ( .9)           (1.0)
  Unreleased                              21            -2.8            -2.3            -5.2
    Standard error                                      ( .6)           ( .6)           ( .6)



When classified by type of science, results are less consistent. At all three
ages, released biology exercises showed larger declines than did unreleased
biology exercises. In physical science, declines were somewhat lower for
released than for unreleased exercises. There were very few unclassified
exercises, and changes on these were quite similar for released and unreleased
exercises at ages 13 and 17. Thus, for the small number of released science
exercises on which National Assessment has repeated change measures, release
for public use may not have had much effect on the percentage of correct
responses.
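
The comparison of released and unreleased exercises can be illustrated with the all-exercise totals from Tables C-5 to C-7. The Python sketch below computes the released-minus-unreleased difference in total change at each age, together with a rough standard error for that difference. Treating the two estimates as independent is an assumption made here purely for illustration (both come from the same samples), and the helper name difference_with_se is hypothetical; this is not a procedure described in the report.

```python
import math

# Total change in percentage correct and its standard error, all exercises
# (Tables C-5 to C-7): age -> ((released, SE), (unreleased, SE)).
totals = {
    9: ((-2.2, 0.9), (-2.1, 0.6)),
    13: ((-4.0, 0.9), (-3.8, 0.7)),
    17: ((-7.2, 1.0), (-5.2, 0.6)),
}


def difference_with_se(released, unreleased):
    """Return (difference, rough SE), assuming independent estimates."""
    diff = released[0] - unreleased[0]
    se = math.sqrt(released[1] ** 2 + unreleased[1] ** 2)
    return diff, se


for age, (rel, unrel) in totals.items():
    diff, se = difference_with_se(rel, unrel)
    print(f"Age {age}: difference {diff:+.1f}, approximate SE {se:.1f}")
```

Under these assumptions, the differences at ages 9 and 13 are far smaller than their approximate standard errors, and the age 17 difference is somewhat less than twice its approximate standard error.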

National Assessment has no control over the use of released exercises. At
any time, a specific exercise may be used in other assessment or testing
programs or reproduced in newspapers, journal articles or textbooks. If large
numbers of students are exposed to the exact content of the exercise, it is
irreparably contaminated for measuring changes in achievement. Consequently,
National Assessment's reuse of previously released exercises is minimal.
APPENDIX D



NONRESPONSE IN ASSESSMENT SAMPLES



In addition to sampling variability, estimates of population values
computed from sample surveys might be subject to random error and systematic bias.
Systematic bias, or nonrandom error, might result from estimation procedures,
errors inherent in measurement and data collection procedures, and nonresponse.
Sampling variability and random error are discussed in Chapter 1, and nonrandom
errors are described in Appendix C. This appendix examines nonresponse in the
1969-70, 1972-73 and 1976-77 assessments. Since nonresponse rates at ages 9
and 13 are relatively small, the following discussion concerns 17-year-olds'
response rates only.

Bias due to nonresponse is present in virtually every sample survey but
is frequently ignored since it is difficult to estimate its size. A variety
of factors contribute to nonresponse. Nonrespondents might either be difficult
to notify or reluctant to participate once they are notified; some might be
absent from school during the entire contact period with item administrators.
However, these nonrespondents can be important since, if they respond
differently than did the people actually included in the sample, estimates of
percentages based solely on the sample are biased and not properly representative
of the age population being assessed.

To provide some information about the size of the bias due to nonresponse
in National Assessment surveys, the Research Triangle Institute, Raleigh,
North Carolina, was asked to conduct a special study of nonrespondents during
the 1972-73 assessment of science and mathematics. The study was conducted
on the age population of eligible 17-year-olds who, at the time of the
assessment, were listed as enrolled in school. Some of these students, in fact,
were no longer attending school at the time of the assessment. Eligibles had
to be English-speaking, physically and emotionally able to respond to exercises
as administered and not residing in an institution.

The results of the nonresponse study^ indicate that 17-year-olds listed
as enrolled in schools but not appearing at the designated time of assessment
can be divided into two different groups. The first group of nonrespondents,



^W.D. Kalsbeek et al., No Show Analysis, Final Report (Raleigh, N.C.: Research
Triangle Institute, 1975); W.T. Rogers et al., "Assessment of Nonresponse Bias
in Sample Surveys: An Example From National Assessment," Journal of Educational
Measurement, Vol. 14, No. 4, 1977.
which comprises about 80% of the total nonrespondent group, did not appear for
the assessment because of conflicting school activities or illness. The
performance of this group was not very different from the performance of students
assessed during the regularly scheduled sessions. The second group of
nonrespondents, which comprises about 20% of the nonrespondents, do not appear to be
available in the schools at any time. They attend infrequently if they attend
at all (for practical purposes they have dropped out of school), or they have
moved out of the school attendance area. In either case, these students should
probably not have been listed in the in-school population of eligibles. This
group, in contrast to the group of nonrespondents who were in fact attending
school, performed more poorly on assessment questions than students assessed
during the scheduled sessions.

The weights used by National Assessment to estimate the percentage of
acceptable responses are adjusted for nonresponse. The adjustment assumes
that the nonrespondents would perform, on the average, in a manner similar to
those who did respond. However, the nonresponse study showed that the second
group of nonrespondents, those enrolled in but not actually attending school,
typically performed at a lower level than either those who did respond or the
first group of nonrespondents. If the second group is included in the
population of eligibles, the nonresponse adjustment procedure used by National
Assessment would result in overestimates of the true percentages of acceptable
responses.
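
The direction of this bias can be seen in a small numerical sketch. The Python function below combines respondents with the two groups of nonrespondents. The 80%/20% split between the groups comes from the nonresponse study; the response rate and the three percentages of acceptable responses are invented for illustration and are not National Assessment results, and the function name true_percentage is hypothetical.

```python
def true_percentage(resp_share, p_resp, p_group1, p_group2, group2_share=0.20):
    """Percentage of acceptable responses over all listed eligibles.

    resp_share   -- share of listed eligibles who were assessed
    p_resp       -- percentage acceptable among respondents
    p_group1     -- percentage among nonrespondents who attend school
    p_group2     -- percentage among nonrespondents not actually attending
    group2_share -- share of nonrespondents in the second group (about 20%)
    """
    nonresp_share = 1.0 - resp_share
    p_nonresp = (1.0 - group2_share) * p_group1 + group2_share * p_group2
    return resp_share * p_resp + nonresp_share * p_nonresp


# Hypothetical inputs: 75% of listed eligibles respond; respondents and the
# first group both score 60%; the second group scores 40%.
adjusted = 60.0  # the adjustment assumes nonrespondents perform like respondents
actual = true_percentage(0.75, 60.0, 60.0, 40.0)
print(f"adjusted {adjusted:.1f}, actual {actual:.1f}, overestimate {adjusted - actual:.1f}")
```

With these made-up inputs the adjusted estimate exceeds the value for all listed eligibles by one percentage point; the gap grows with the nonresponse rate and with the performance deficit of the second group.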

Because the second group of students is effectively no longer attending
school, it does not seem appropriate to include them in estimates for
17-year-olds in school. Thus, these students are not considered part of the population
of eligibles and are excluded from the computations of percentage of the sample
covered for 17-year-olds shown in Table D-1.

Including the second group of students and then reducing bias due to their
nonresponse would require the location and testing of some of these individuals.
The difficulty and costs associated with supplementary data collection of the
nonrespondents not actually attending school are so great that this has not
been a feasible alternative in recent years.

National Assessment continually evaluates its field procedures and has
introduced new methods to lessen the effects of nonresponse. In the second
and third assessments of science for 17-year-olds, item administrators used
the day following a regularly scheduled assessment session to locate and assess
nonrespondents. This helped to reduce the bias due to nonresponse of students
enrolled in and attending school.

However, systematic bias in change measures can be introduced if the use
of new procedures results in very different samples in different assessment
years. Thus, measures of change from previous years are still based upon
samples obtained using the old procedures. Measures intended for use in
determining future changes are based on samples obtained using the new procedures.

Table D-1 shows the average sample coverage per package (booklet) of
exercises administered in 1969-70, 1972-73 and 1976-77. The rate of coverage is
based on an estimated total eligible age population of students who are
available [...]; for 17-year-olds, those enrolled minus the 20% estimated to
be [...] but unavailable in school. For completeness, figures are also
shown [...] individually administered [...] 1972-73 contained exercises used to
measure changes in achievement.

Figures for 17-year-olds include both a sample of 17-year-olds assessed
[...] (no attempt made to contact and assess nonrespondents the following day)
and a sample assessed according to the new procedures ([...] nonrespondents).
Since the 1969-70 and 1972-73 samples did not include follow-up attempts,
changes in percentages between assessments are based upon the 1976-77 sample that
does not include the follow-up attempts. Changes toward future years will be based
upon the sample that does include follow-up attempts.


TABLE D-1. Number of Students Assessed and Percent of Sample
Covered by Age, Assessment Year and Type of Administration

                     Type of         Number of    Number of Students   Number Assessed   Coverage
Year       Age    Administration     Packages         Assessed           Per Package     in Percent

1969-70     9     G                      8             19,468               2,434         88.0††
                  I                      2              3,713               1,856         89.1††
           13     G                      9             21,696               2,411         85.6††
                  I                      3              5,568               1,856         87.2††
           17     G                     11             22,913               2,083         74.5
                  I                      2              3,328               1,664         71.2

1972-73     9     G                      7             18,638               2,663         89.3
                  I                      3              6,766               2,255         ...
           13     G                      9             23,307               2,612         85.5
                  I                      3              6,744               2,248         ...
           17     G                     11             25,865               2,351         73.6
                  I                      3              6,500               2,167         77.2

1976-77     9     G                      7              ...                 2,478         88.6
           13     G                     10             25,653               2,565         86.2
           17     G w/o F**             11             29,140               2,649         73.1
                  G w/ F**              11             34,514               3,137         83.7




GOVERNMENT 'PRINTING OFFICE 1979 - 679-2'l'l/38l Reg. 8 P 



NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS
Education Commission of the States

Dixy Lee Ray, Governor of Washington, Chairperson, Education Commission of the States
Warren G. Hill, Executive Director, Education Commission of the States
Roy H. Forbes, Director, National Assessment

All National Assessment reports and publications are available through NAEP offices at the address shown at the
bottom. Some of the more recent results reports are also available at the Superintendent of Documents (SOD), usually
at lower prices. To order from the SOD, write to Supt. of Documents, U.S. Government Printing Office, Washington,
D.C. 20402. Check must accompany order. Allow four to eight weeks for delivery.

Reports ordered from National Assessment should be delivered within 12 days. Reports related to this report and
available from National Assessment include:

BASIC LIFE SKILLS (special probe)

1st Assessment (1976-77)
08-BLS-21  Basic Life Skills Technical Report, ...

ADULTS (special probe)

1st Assessment (1976-77)

08-H-01    Checkup: A National Assessment of Health Awareness Among 17-Year-Olds and Young Adults,
           September 1978

08-E-01    Energy: Knowledge and Attitudes, A National Assessment of Energy Awareness Among Young
           Adults, December 1978

           Technical Information and Data From the 1977 Young Adult Assessment of Health, Energy and
           Reading, March 1979



3.75 
3.75 
15.00 



2.70 



SCIENCE

1st Assessment (1969-70)

Report 1   Science: National Results, July 1970
Report 4   Science: Results by sex, region and size of community, April 1971
Report 7   Science: Results by race, parental education, size and type of community; also balanced results
           for all groups, May 1973

2nd Assessment (1972-73)

04-S-01    Selected Results From the National Assessments of Science: Energy Questions, May 1975
04-S-02    Selected Results From the National Assessments of Science: Scientific Principles and
           Procedures, August 1975
04-S-03    Selected Results From the National Assessments of Science: Attitude Questions, October 1975
04-S-00    National Assessments of Science, 1969 and 1973: A Capsule Description of Changes in Science
           Achievement, February 1975
04-S-20    Changes in Science Performance, 1969-73: Exercise Volume, December ...                        25.00
04-S-20    Changes in Science Performance, 1969-73: Exercise Volume, Appendix (2 vols.), April 1977      25.00
04-S-21    Science Technical Report: Summary Volume, May 1977
BRS-1      Science Achievement: Racial and Regional Trends, 1969-73, March 1976                           3.95

3rd Assessment (1976-77)

08-S-00    Three National Assessments of Science: Changes in Achievement, 1969-77, June 1978
08-S-01    Science Achievement in the Schools: A Summary of Results From the 1976-77 National
           Assessment of Science, December 1978
08-S-21    Three Assessments of Science, 1969-77: Technical Summary, April 1979                          10.60
           The Third Assessment of Science, 1976-77: Released Exercise Set, May 1978                     18.90
           Technical Appendix to the Third Assessment of Science, 1976-77: Released Exercise Set,
           December 1978



2.75 



4.45 
2.50' 



BACKGROUND REPORTS

BR-2       Hispanic Student Achievement in Five Learning Areas: 1971-75. Data for 9-, 13- and 17-year-olds
           in reading, mathematics, science, social studies and career and occupational development,
           May 1977

03/04-GIY  General Information Yearbook. A condensed description of the Assessment's methodology,
           December 1974

In addition to the above reports, National Assessment has produced reports in the areas of social studies, citizenship,
writing, literature, reading, music, art and career and occupational development. A complete publications list and
ordering information are available from the address below.

NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS
Suite 700, 1860 Lincoln Street
Denver, Colorado 80295



